Warning: Permanently added '2620:52:3:1:dead:beef:cafe:c193' (ED25519) to the list of known hosts.
You can reproduce this build on your computer by running:
  sudo dnf install copr-rpmbuild
  /usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/9669522-fedora-43-x86_64 --chroot fedora-43-x86_64
Version: 1.6
PID: 2239
Logging PID: 2241
Task: {'allow_user_ssh': False, 'appstream': False, 'background': False, 'build_id': 9669522, 'buildroot_pkgs': [], 'chroot': 'fedora-43-x86_64', 'enable_net': True, 'fedora_review': False, 'git_hash': '89f896af3614a64b00674555d2017c43824bb700', 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/rocm-test-43/rccl', 'isolation': 'default', 'memory_reqs': 2048, 'package_name': 'rccl', 'package_version': '6.4.2-5', 'project_dirname': 'rocm-test-43', 'project_name': 'rocm-test-43', 'project_owner': '@rocm-packagers-sig', 'repo_priority': None, 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/@rocm-packagers-sig/rocm-test-43/fedora-43-x86_64/', 'id': 'copr_base', 'name': 'Copr repository', 'priority': None}, {'baseurl': 'https://kojipkgs.fedoraproject.org/repos/f43-build-side-119953/6609762/x86_64', 'id': 'https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64', 'name': 'Additional repo https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64'}], 'sandbox': '@rocm-packagers-sig/rocm-test-43--tflink', 'source_json': {}, 'source_type': None, 'ssh_public_keys': None, 'storage': 0, 'submitter': 'tflink', 'tags': [], 'task_id': '9669522-fedora-43-x86_64', 'timeout': 172800, 'uses_devel_repo': False, 'with_opts': ['test'], 'without_opts': []}
Running: git clone https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/rocm-test-43/rccl /var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl --depth 500 --no-single-branch --recursive
cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/@rocm-packagers-sig/rocm-test-43/rccl', '/var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr: Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl'...
Running: git checkout 89f896af3614a64b00674555d2017c43824bb700 --
cmd: ['git', 'checkout', '89f896af3614a64b00674555d2017c43824bb700', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl
rc: 0
stdout:
stderr: Note: switching to '89f896af3614a64b00674555d2017c43824bb700'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
  git switch -c <new-branch-name>
Or undo this operation with:
  git switch -
Turn off this advice by setting config variable advice.detachedHead to false
HEAD is now at 89f896a automatic import of rccl
Running: dist-git-client sources
cmd: ['dist-git-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl
rc: 0
stdout:
stderr: INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
INFO: Downloading RCCL-6.4.2.tar.gz
INFO: Reading stdout from command: curl --help all
INFO: Calling: curl -H Pragma: -o RCCL-6.4.2.tar.gz --location --connect-timeout 60 --retry 3 --retry-delay 10 --remote-time --show-error --fail --retry-all-errors https://copr-dist-git.fedorainfracloud.org/repo/pkgs/@rocm-packagers-sig/rocm-test-43/rccl/RCCL-6.4.2.tar.gz/md5/5323c56546d4e3634f04898820e8816c/RCCL-6.4.2.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1851k  100 1851k    0     0  13.6M      0 --:--:-- --:--:-- --:--:-- 13.6M
INFO: Reading stdout from command: md5sum RCCL-6.4.2.tar.gz
tail: /var/lib/copr-rpmbuild/main.log: file truncated
Running (timeout=172800): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl/rccl.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759943729.176052 -r /var/lib/copr-rpmbuild/results/configs/child.cfg --with test
INFO: mock.py version 6.3 starting (python version = 3.13.7, NVR = mock-6.3-1.fc42), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl/rccl.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759943729.176052 -r /var/lib/copr-rpmbuild/results/configs/child.cfg --with test
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl/rccl.spec) Config(fedora-43-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 6.3
INFO: Mock Version: 6.3
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759943729.176052/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using container image: registry.fedoraproject.org/fedora:43
INFO: Pulling image: registry.fedoraproject.org/fedora:43
INFO: Tagging container image as mock-bootstrap-d86a055b-4728-4925-aafe-8dd76ab2a9b0
INFO: Checking that 020c3172b181f17e045bf1376adb53d2a12d6c2d20f6750d5e31635c21b5056b image matches host's architecture
INFO: Copy content of container 020c3172b181f17e045bf1376adb53d2a12d6c2d20f6750d5e31635c21b5056b to /var/lib/mock/fedora-43-x86_64-bootstrap-1759943729.176052/root
INFO: mounting 020c3172b181f17e045bf1376adb53d2a12d6c2d20f6750d5e31635c21b5056b with podman image mount
INFO: image 020c3172b181f17e045bf1376adb53d2a12d6c2d20f6750d5e31635c21b5056b as /var/lib/containers/storage/overlay/148c1af1b5ba66f3f456be0e6e4d5963a2bb6b0b205fded143fd9fe738cd87f2/merged
INFO: umounting image 020c3172b181f17e045bf1376adb53d2a12d6c2d20f6750d5e31635c21b5056b (/var/lib/containers/storage/overlay/148c1af1b5ba66f3f456be0e6e4d5963a2bb6b0b205fded143fd9fe738cd87f2/merged) with podman image umount
INFO: Removing image mock-bootstrap-d86a055b-4728-4925-aafe-8dd76ab2a9b0
INFO: Package manager dnf5 detected and used (fallback)
INFO: Not updating bootstrap chroot, bootstrap_image_ready=True
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1759943729.176052/root.
INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Package manager dnf5 detected and used (direct choice) INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.2.17.0-2.fc43.x86_64 dnf5-plugins-5.2.17.0-2.fc43.x86_64 Start: installing minimal buildroot with dnf5 Updating and loading repositories: Copr repository 100% | 181.8 KiB/s | 24.0 KiB | 00m00s updates 100% | 44.3 KiB/s | 33.3 KiB | 00m01s Additional repo https_kojipkgs_fedorap 100% | 15.5 MiB/s | 14.4 MiB | 00m01s fedora 100% | 19.9 MiB/s | 36.2 MiB | 00m02s Repositories loaded. Package Arch Version Repository Size Installing group/module packages: bash x86_64 5.3.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.4 MiB bzip2 x86_64 1.0.8-21.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 95.3 KiB coreutils x86_64 9.7-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 5.4 MiB cpio x86_64 2.15-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.1 MiB diffutils x86_64 3.12-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.6 MiB fedora-release-common noarch 43-0.23 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 20.6 KiB findutils x86_64 1:4.10.0-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.8 MiB gawk x86_64 5.3.2-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.8 MiB glibc-minimal-langpack x86_64 2.42-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 0.0 B grep x86_64 3.12-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.0 MiB 
gzip x86_64 1.13-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 388.8 KiB info x86_64 7.2-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 353.9 KiB patch x86_64 2.8-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 222.8 KiB redhat-rpm-config noarch 343-11.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 182.9 KiB rpm-build x86_64 6.0.0-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 287.4 KiB sed x86_64 4.9-5.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 857.3 KiB shadow-utils x86_64 2:4.18.0-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.9 MiB tar x86_64 2:1.35-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.9 MiB unzip x86_64 6.0-67.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 386.3 KiB util-linux x86_64 2.41.1-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.5 MiB which x86_64 2.23-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 83.5 KiB xz x86_64 1:5.8.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.3 MiB Installing dependencies: add-determinism x86_64 0.6.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.4 MiB alternatives x86_64 1.33-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 62.2 KiB ansible-srpm-macros noarch 1-18.1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 35.7 KiB audit-libs x86_64 4.1.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 378.8 KiB binutils x86_64 2.45-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 26.5 MiB 
build-reproducibility-srpm-macros noarch 0.6.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 735.0 B bzip2-libs x86_64 1.0.8-21.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 80.6 KiB ca-certificates noarch 2025.2.80_v9.0.304-1.1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.7 MiB coreutils-common x86_64 9.7-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 11.3 MiB crypto-policies noarch 20250714-5.gitcd6043a.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 146.9 KiB curl x86_64 8.15.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 473.6 KiB cyrus-sasl-lib x86_64 2.1.28-33.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.3 MiB debugedit x86_64 5.2-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 214.0 KiB dwz x86_64 0.16-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 287.1 KiB ed x86_64 1.22.2-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 148.1 KiB efi-srpm-macros noarch 6-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 40.1 KiB elfutils x86_64 0.193-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.9 MiB elfutils-debuginfod-client x86_64 0.193-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 83.9 KiB elfutils-default-yama-scope noarch 0.193-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.8 KiB elfutils-libelf x86_64 0.193-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.2 MiB elfutils-libs x86_64 0.193-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 683.4 KiB fedora-gpg-keys noarch 43-0.4 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 131.2 KiB fedora-release noarch 43-0.23 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 0.0 B fedora-release-identity-basic noarch 43-0.23 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 658.0 B fedora-repos noarch 43-0.4 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.9 KiB file x86_64 5.46-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 100.2 KiB file-libs x86_64 5.46-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 11.9 MiB filesystem x86_64 3.18-50.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 112.0 B filesystem-srpm-macros noarch 3.18-50.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 38.2 KiB fonts-srpm-macros noarch 1:2.0.5-23.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 55.8 KiB forge-srpm-macros noarch 0.4.0-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 38.9 KiB fpc-srpm-macros noarch 1.3-15.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 144.0 B gap-srpm-macros noarch 2-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.1 KiB gdb-minimal x86_64 16.3-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 13.3 MiB gdbm-libs x86_64 1:1.23-10.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 129.9 KiB ghc-srpm-macros noarch 1.9.2-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 779.0 B glibc x86_64 2.42-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.7 MiB glibc-common x86_64 2.42-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.0 MiB glibc-gconv-extra x86_64 
2.42-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 7.2 MiB gmp x86_64 1:6.3.0-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 811.2 KiB gnat-srpm-macros noarch 6-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.0 KiB gnulib-l10n noarch 20241231-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 655.0 KiB gnupg2 x86_64 2.4.8-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.5 MiB gnupg2-dirmngr x86_64 2.4.8-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 618.4 KiB gnupg2-gpg-agent x86_64 2.4.8-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 671.4 KiB gnupg2-gpgconf x86_64 2.4.8-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 250.0 KiB gnupg2-keyboxd x86_64 2.4.8-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 201.4 KiB gnupg2-verify x86_64 2.4.8-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 348.5 KiB gnutls x86_64 3.8.10-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.8 MiB go-srpm-macros noarch 3.8.0-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 61.9 KiB gpgverify noarch 2.2-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.7 KiB ima-evm-utils-libs x86_64 1.6.2-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 60.7 KiB jansson x86_64 2.14-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 89.1 KiB java-srpm-macros noarch 1-7.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 870.0 B json-c x86_64 0.18-7.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 82.7 KiB kernel-srpm-macros noarch 
1.0-27.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.9 KiB keyutils-libs x86_64 1.6.3-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 54.3 KiB krb5-libs x86_64 1.21.3-7.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.3 MiB libacl x86_64 2.3.2-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 35.9 KiB libarchive x86_64 3.8.1-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 951.1 KiB libassuan x86_64 2.5.7-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 163.8 KiB libattr x86_64 2.5.2-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 24.4 KiB libblkid x86_64 2.41.1-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 262.4 KiB libbrotli x86_64 1.1.0-10.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 833.3 KiB libcap x86_64 2.76-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 209.1 KiB libcap-ng x86_64 0.8.5-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 68.9 KiB libcom_err x86_64 1.47.3-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 63.1 KiB libcurl x86_64 8.15.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 903.2 KiB libeconf x86_64 0.7.9-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 64.9 KiB libevent x86_64 2.1.12-16.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 883.1 KiB libfdisk x86_64 2.41.1-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 380.4 KiB libffi x86_64 3.5.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 83.6 KiB libfsverity x86_64 1.6-3.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 28.5 KiB libgcc x86_64 15.2.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 266.6 KiB libgcrypt x86_64 1.11.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.6 MiB libgomp x86_64 15.2.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 541.1 KiB libgpg-error x86_64 1.55-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 915.3 KiB libidn2 x86_64 2.3.8-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 552.5 KiB libksba x86_64 1.6.7-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 398.5 KiB liblastlog2 x86_64 2.41.1-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 33.9 KiB libmount x86_64 2.41.1-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 372.7 KiB libnghttp2 x86_64 1.66.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 162.2 KiB libpkgconf x86_64 2.3.0-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 78.1 KiB libpsl x86_64 0.21.5-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 76.4 KiB libselinux x86_64 3.9-5.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 193.1 KiB libsemanage x86_64 3.9-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 308.5 KiB libsepol x86_64 3.9-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 822.0 KiB libsmartcols x86_64 2.41.1-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 180.5 KiB libssh x86_64 0.11.3-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 567.1 KiB libssh-config noarch 0.11.3-1.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 277.0 B libstdc++ x86_64 15.2.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.8 MiB libtasn1 x86_64 4.20.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 176.3 KiB libtool-ltdl x86_64 2.5.4-7.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 70.1 KiB libunistring x86_64 1.1-10.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.7 MiB libusb1 x86_64 1.0.29-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 171.3 KiB libuuid x86_64 2.41.1-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 37.4 KiB libverto x86_64 0.3.2-11.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 25.4 KiB libxcrypt x86_64 4.4.38-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 284.5 KiB libxml2 x86_64 2.12.10-5.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.7 MiB libzstd x86_64 1.5.7-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 799.9 KiB lua-libs x86_64 5.4.8-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 280.8 KiB lua-srpm-macros noarch 1-16.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.3 KiB lz4-libs x86_64 1.10.0-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 161.4 KiB mpfr x86_64 4.2.2-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 832.8 KiB ncurses-base noarch 6.5-7.20250614.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 328.1 KiB ncurses-libs x86_64 6.5-7.20250614.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 946.3 KiB nettle x86_64 3.10.1-2.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 790.6 KiB npth x86_64 1.8-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 49.6 KiB ocaml-srpm-macros noarch 11-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.9 KiB openblas-srpm-macros noarch 2-20.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 112.0 B openldap x86_64 2.6.10-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 659.9 KiB openssl-libs x86_64 1:3.5.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.9 MiB p11-kit x86_64 0.25.8-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.3 MiB p11-kit-trust x86_64 0.25.8-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 446.5 KiB package-notes-srpm-macros noarch 0.5-14.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.6 KiB pam-libs x86_64 1.7.1-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 126.8 KiB pcre2 x86_64 10.46-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 697.7 KiB pcre2-syntax noarch 10.46-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 275.3 KiB perl-srpm-macros noarch 1-60.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 861.0 B pkgconf x86_64 2.3.0-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 88.5 KiB pkgconf-m4 noarch 2.3.0-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 14.4 KiB pkgconf-pkg-config x86_64 2.3.0-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 989.0 B popt x86_64 1.19-9.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 132.8 KiB publicsuffix-list-dafsa noarch 
20250616-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 69.1 KiB pyproject-srpm-macros noarch 1.18.4-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.9 KiB python-srpm-macros noarch 3.14-5.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 51.5 KiB qt5-srpm-macros noarch 5.15.17-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 500.0 B qt6-srpm-macros noarch 6.9.2-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 464.0 B readline x86_64 8.3-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 511.7 KiB rpm x86_64 6.0.0-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.1 MiB rpm-build-libs x86_64 6.0.0-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 268.4 KiB rpm-libs x86_64 6.0.0-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 933.7 KiB rpm-sequoia x86_64 1.9.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.5 MiB rpm-sign-libs x86_64 6.0.0-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 39.7 KiB rust-srpm-macros noarch 26.4-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.8 KiB setup noarch 2.15.0-26.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 725.0 KiB sqlite-libs x86_64 3.50.2-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.5 MiB systemd-libs x86_64 258-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.3 MiB systemd-standalone-sysusers x86_64 258-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 293.5 KiB tpm2-tss x86_64 4.1.3-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.6 MiB 
tree-sitter-srpm-macros noarch 0.4.2-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.3 KiB util-linux-core x86_64 2.41.1-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.5 MiB xxhash-libs x86_64 0.8.3-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 90.2 KiB xz-libs x86_64 1:5.8.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 217.8 KiB zig-srpm-macros noarch 1-5.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.1 KiB zip x86_64 3.0-44.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 694.5 KiB zlib-ng-compat x86_64 2.2.5-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 137.6 KiB zstd x86_64 1.5.7-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.7 MiB Installing groups: Buildsystem building group Transaction Summary: Installing: 170 packages Total size of inbound packages is 59 MiB. Need to download 59 MiB. After this operation, 199 MiB extra will be used (install 199 MiB, remove 0 B). [ 1/170] bzip2-0:1.0.8-21.fc43.x86_64 100% | 465.1 KiB/s | 51.6 KiB | 00m00s [ 2/170] cpio-0:2.15-6.fc43.x86_64 100% | 2.7 MiB/s | 286.6 KiB | 00m00s [ 3/170] diffutils-0:3.12-3.fc43.x86_6 100% | 6.7 MiB/s | 384.1 KiB | 00m00s [ 4/170] fedora-release-common-0:43-0. 
100% | 1.2 MiB/s | 24.8 KiB | 00m00s [ 5/170] coreutils-0:9.7-6.fc43.x86_64 100% | 3.2 MiB/s | 1.1 MiB | 00m00s [ 6/170] gawk-0:5.3.2-2.fc43.x86_64 100% | 6.0 MiB/s | 1.1 MiB | 00m00s [ 7/170] findutils-1:4.10.0-6.fc43.x86 100% | 101.6 KiB/s | 541.2 KiB | 00m05s [ 8/170] glibc-minimal-langpack-0:2.42 100% | 7.4 KiB/s | 38.3 KiB | 00m05s [ 9/170] grep-0:3.12-2.fc43.x86_64 100% | 1.7 MiB/s | 289.1 KiB | 00m00s [ 10/170] info-0:7.2-6.fc43.x86_64 100% | 1.9 MiB/s | 182.9 KiB | 00m00s [ 11/170] patch-0:2.8-2.fc43.x86_64 100% | 1.9 MiB/s | 113.8 KiB | 00m00s [ 12/170] gzip-0:1.13-4.fc43.x86_64 100% | 752.1 KiB/s | 164.0 KiB | 00m00s [ 13/170] rpm-build-0:6.0.0-1.fc43.x86_ 100% | 1.0 MiB/s | 130.0 KiB | 00m00s [ 14/170] sed-0:4.9-5.fc43.x86_64 100% | 1.4 MiB/s | 308.6 KiB | 00m00s [ 15/170] shadow-utils-2:4.18.0-3.fc43. 100% | 982.1 KiB/s | 1.2 MiB | 00m01s [ 16/170] tar-2:1.35-6.fc43.x86_64 100% | 2.5 MiB/s | 847.8 KiB | 00m00s [ 17/170] unzip-0:6.0-67.fc43.x86_64 100% | 3.3 MiB/s | 183.7 KiB | 00m00s [ 18/170] util-linux-0:2.41.1-17.fc43.x 100% | 5.4 MiB/s | 1.1 MiB | 00m00s [ 19/170] which-0:2.23-3.fc43.x86_64 100% | 1.9 MiB/s | 41.7 KiB | 00m00s [ 20/170] xz-1:5.8.1-2.fc43.x86_64 100% | 6.4 MiB/s | 556.6 KiB | 00m00s [ 21/170] glibc-0:2.42-4.fc43.x86_64 100% | 10.8 MiB/s | 2.2 MiB | 00m00s [ 22/170] xz-libs-1:5.8.1-2.fc43.x86_64 100% | 4.4 MiB/s | 112.9 KiB | 00m00s [ 23/170] audit-libs-0:4.1.1-2.fc43.x86 100% | 5.6 MiB/s | 138.5 KiB | 00m00s [ 24/170] filesystem-0:3.18-50.fc43.x86 100% | 8.2 MiB/s | 1.3 MiB | 00m00s [ 25/170] libblkid-0:2.41.1-17.fc43.x86 100% | 5.7 MiB/s | 123.1 KiB | 00m00s [ 26/170] bash-0:5.3.0-2.fc43.x86_64 100% | 184.0 KiB/s | 1.8 MiB | 00m10s [ 27/170] libfdisk-0:2.41.1-17.fc43.x86 100% | 6.9 MiB/s | 161.3 KiB | 00m00s [ 28/170] libgcc-0:15.2.1-2.fc43.x86_64 100% | 5.4 MiB/s | 133.0 KiB | 00m00s [ 29/170] liblastlog2-0:2.41.1-17.fc43. 
100% | 1.1 MiB/s | 23.2 KiB | 00m00s [ 30/170] libmount-0:2.41.1-17.fc43.x86 100% | 6.6 MiB/s | 162.5 KiB | 00m00s [ 31/170] libselinux-0:3.9-5.fc43.x86_6 100% | 4.5 MiB/s | 97.7 KiB | 00m00s [ 32/170] libsmartcols-0:2.41.1-17.fc43 100% | 3.9 MiB/s | 84.0 KiB | 00m00s [ 33/170] libuuid-0:2.41.1-17.fc43.x86_ 100% | 1.3 MiB/s | 26.2 KiB | 00m00s [ 34/170] ncurses-libs-0:6.5-7.20250614 100% | 13.5 MiB/s | 332.7 KiB | 00m00s [ 35/170] pam-libs-0:1.7.1-3.fc43.x86_6 100% | 2.7 MiB/s | 57.5 KiB | 00m00s [ 36/170] readline-0:8.3-2.fc43.x86_64 100% | 9.5 MiB/s | 224.6 KiB | 00m00s [ 37/170] systemd-libs-0:258-1.fc43.x86 100% | 26.7 MiB/s | 819.8 KiB | 00m00s [ 38/170] util-linux-core-0:2.41.1-17.f 100% | 20.1 MiB/s | 534.9 KiB | 00m00s [ 39/170] zlib-ng-compat-0:2.2.5-2.fc43 100% | 3.5 MiB/s | 79.2 KiB | 00m00s [ 40/170] bzip2-libs-0:1.0.8-21.fc43.x8 100% | 2.1 MiB/s | 43.1 KiB | 00m00s [ 41/170] libacl-0:2.3.2-4.fc43.x86_64 100% | 1.1 MiB/s | 24.3 KiB | 00m00s [ 42/170] libcap-0:2.76-3.fc43.x86_64 100% | 4.0 MiB/s | 86.9 KiB | 00m00s [ 43/170] libeconf-0:0.7.9-2.fc43.x86_6 100% | 1.6 MiB/s | 35.2 KiB | 00m00s [ 44/170] libsemanage-0:3.9-4.fc43.x86_ 100% | 5.7 MiB/s | 123.5 KiB | 00m00s [ 45/170] libxcrypt-0:4.4.38-8.fc43.x86 100% | 5.9 MiB/s | 127.0 KiB | 00m00s [ 46/170] setup-0:2.15.0-26.fc43.noarch 100% | 6.4 MiB/s | 151.2 KiB | 00m00s [ 47/170] binutils-0:2.45-1.fc43.x86_64 100% | 52.2 MiB/s | 5.8 MiB | 00m00s [ 48/170] debugedit-0:5.2-3.fc43.x86_64 100% | 3.8 MiB/s | 85.6 KiB | 00m00s [ 49/170] elfutils-0:0.193-3.fc43.x86_6 100% | 20.5 MiB/s | 566.1 KiB | 00m00s [ 50/170] elfutils-libelf-0:0.193-3.fc4 100% | 9.2 MiB/s | 207.8 KiB | 00m00s [ 51/170] file-0:5.46-8.fc43.x86_64 100% | 2.4 MiB/s | 48.8 KiB | 00m00s [ 52/170] libarchive-0:3.8.1-3.fc43.x86 100% | 16.5 MiB/s | 421.1 KiB | 00m00s [ 53/170] redhat-rpm-config-0:343-11.fc 100% | 14.4 KiB/s | 72.9 KiB | 00m05s [ 54/170] libstdc++-0:15.2.1-2.fc43.x86 100% | 29.0 MiB/s | 920.1 KiB | 00m00s [ 55/170] 
pkgconf-pkg-config-0:2.3.0-3. 100% | 457.5 KiB/s | 9.6 KiB | 00m00s [ 56/170] popt-0:1.19-9.fc43.x86_64 100% | 2.7 MiB/s | 59.1 KiB | 00m00s [ 57/170] rpm-build-libs-0:6.0.0-1.fc43 100% | 5.9 MiB/s | 127.9 KiB | 00m00s [ 58/170] rpm-libs-0:6.0.0-1.fc43.x86_6 100% | 15.0 MiB/s | 400.2 KiB | 00m00s [ 59/170] zstd-0:1.5.7-2.fc43.x86_64 100% | 14.4 MiB/s | 485.9 KiB | 00m00s [ 60/170] curl-0:8.15.0-2.fc43.x86_64 100% | 9.1 MiB/s | 233.7 KiB | 00m00s [ 61/170] glibc-gconv-extra-0:2.42-4.fc 100% | 13.1 MiB/s | 1.5 MiB | 00m00s [ 62/170] ansible-srpm-macros-0:1-18.1. 100% | 829.5 KiB/s | 19.9 KiB | 00m00s [ 63/170] build-reproducibility-srpm-ma 100% | 563.0 KiB/s | 11.8 KiB | 00m00s [ 64/170] rpm-0:6.0.0-1.fc43.x86_64 100% | 1.7 MiB/s | 545.3 KiB | 00m00s [ 65/170] dwz-0:0.16-2.fc43.x86_64 100% | 5.5 MiB/s | 135.5 KiB | 00m00s [ 66/170] efi-srpm-macros-0:6-4.fc43.no 100% | 861.3 KiB/s | 22.4 KiB | 00m00s [ 67/170] filesystem-srpm-macros-0:3.18 100% | 1.0 MiB/s | 26.4 KiB | 00m00s [ 68/170] forge-srpm-macros-0:0.4.0-3.f 100% | 772.6 KiB/s | 20.1 KiB | 00m00s [ 69/170] fonts-srpm-macros-1:2.0.5-23. 
100% | 631.9 KiB/s | 27.2 KiB | 00m00s [ 70/170] fpc-srpm-macros-0:1.3-15.fc43 100% | 394.6 KiB/s | 7.9 KiB | 00m00s [ 71/170] gap-srpm-macros-0:2-1.fc43.no 100% | 407.6 KiB/s | 9.0 KiB | 00m00s [ 72/170] ghc-srpm-macros-0:1.9.2-3.fc4 100% | 416.5 KiB/s | 8.7 KiB | 00m00s [ 73/170] gnat-srpm-macros-0:6-8.fc43.n 100% | 385.7 KiB/s | 8.5 KiB | 00m00s [ 74/170] go-srpm-macros-0:3.8.0-1.fc43 100% | 1.2 MiB/s | 28.3 KiB | 00m00s [ 75/170] java-srpm-macros-0:1-7.fc43.n 100% | 378.3 KiB/s | 7.9 KiB | 00m00s [ 76/170] kernel-srpm-macros-0:1.0-27.f 100% | 469.5 KiB/s | 8.9 KiB | 00m00s [ 77/170] lua-srpm-macros-0:1-16.fc43.n 100% | 437.8 KiB/s | 8.8 KiB | 00m00s [ 78/170] ocaml-srpm-macros-0:11-2.fc43 100% | 463.0 KiB/s | 9.3 KiB | 00m00s [ 79/170] openblas-srpm-macros-0:2-20.f 100% | 379.7 KiB/s | 7.6 KiB | 00m00s [ 80/170] package-notes-srpm-macros-0:0 100% | 449.3 KiB/s | 9.0 KiB | 00m00s [ 81/170] perl-srpm-macros-0:1-60.fc43. 100% | 414.5 KiB/s | 8.3 KiB | 00m00s [ 82/170] pyproject-srpm-macros-0:1.18. 
100% | 684.7 KiB/s | 13.7 KiB | 00m00s [ 83/170] python-srpm-macros-0:3.14-5.f 100% | 1.1 MiB/s | 23.4 KiB | 00m00s [ 84/170] qt5-srpm-macros-0:5.15.17-2.f 100% | 433.1 KiB/s | 8.7 KiB | 00m00s [ 85/170] qt6-srpm-macros-0:6.9.2-1.fc4 100% | 469.3 KiB/s | 9.4 KiB | 00m00s [ 86/170] rust-srpm-macros-0:26.4-1.fc4 100% | 555.9 KiB/s | 11.1 KiB | 00m00s [ 87/170] tree-sitter-srpm-macros-0:0.4 100% | 667.5 KiB/s | 13.4 KiB | 00m00s [ 88/170] zig-srpm-macros-0:1-5.fc43.no 100% | 383.4 KiB/s | 8.4 KiB | 00m00s [ 89/170] pkgconf-0:2.3.0-3.fc43.x86_64 100% | 2.2 MiB/s | 44.6 KiB | 00m00s [ 90/170] pkgconf-m4-0:2.3.0-3.fc43.noa 100% | 695.6 KiB/s | 13.9 KiB | 00m00s [ 91/170] libpkgconf-0:2.3.0-3.fc43.x86 100% | 1.6 MiB/s | 37.9 KiB | 00m00s [ 92/170] ed-0:1.22.2-1.fc43.x86_64 100% | 3.9 MiB/s | 83.7 KiB | 00m00s [ 93/170] libattr-0:2.5.2-6.fc43.x86_64 100% | 892.6 KiB/s | 17.9 KiB | 00m00s [ 94/170] ncurses-base-0:6.5-7.20250614 100% | 3.0 MiB/s | 63.7 KiB | 00m00s [ 95/170] libsepol-0:3.9-2.fc43.x86_64 100% | 11.6 MiB/s | 345.4 KiB | 00m00s [ 96/170] zip-0:3.0-44.fc43.x86_64 100% | 1.4 MiB/s | 261.6 KiB | 00m00s [ 97/170] pcre2-0:10.46-1.fc43.x86_64 100% | 10.7 MiB/s | 262.2 KiB | 00m00s [ 98/170] libxml2-0:2.12.10-5.fc43.x86_ 100% | 19.3 MiB/s | 692.7 KiB | 00m00s [ 99/170] libzstd-0:1.5.7-2.fc43.x86_64 100% | 11.4 MiB/s | 314.6 KiB | 00m00s [100/170] lz4-libs-0:1.10.0-3.fc43.x86_ 100% | 2.5 MiB/s | 78.0 KiB | 00m00s [101/170] openssl-libs-1:3.5.1-2.fc43.x 100% | 18.6 MiB/s | 2.6 MiB | 00m00s [102/170] glibc-common-0:2.42-4.fc43.x8 100% | 12.0 MiB/s | 320.4 KiB | 00m00s [103/170] gmp-1:6.3.0-4.fc43.x86_64 100% | 2.0 MiB/s | 319.3 KiB | 00m00s [104/170] mpfr-0:4.2.2-2.fc43.x86_64 100% | 12.1 MiB/s | 347.0 KiB | 00m00s [105/170] file-libs-0:5.46-8.fc43.x86_6 100% | 3.9 MiB/s | 850.3 KiB | 00m00s [106/170] fedora-repos-0:43-0.4.noarch 100% | 302.8 KiB/s | 9.1 KiB | 00m00s [107/170] elfutils-debuginfod-client-0: 100% | 368.7 KiB/s | 46.8 KiB | 00m00s [108/170] 
elfutils-libs-0:0.193-3.fc43. 100% | 6.4 MiB/s | 269.7 KiB | 00m00s [109/170] sqlite-libs-0:3.50.2-2.fc43.x 100% | 784.9 KiB/s | 760.5 KiB | 00m01s [110/170] coreutils-common-0:9.7-6.fc43 100% | 8.4 MiB/s | 2.1 MiB | 00m00s [111/170] alternatives-0:1.33-2.fc43.x8 100% | 1.8 MiB/s | 40.7 KiB | 00m00s [112/170] jansson-0:2.14-3.fc43.x86_64 100% | 2.1 MiB/s | 45.3 KiB | 00m00s [113/170] lua-libs-0:5.4.8-2.fc43.x86_6 100% | 5.8 MiB/s | 131.7 KiB | 00m00s [114/170] rpm-sequoia-0:1.9.0-2.fc43.x8 100% | 7.9 MiB/s | 933.3 KiB | 00m00s [115/170] libgomp-0:15.2.1-2.fc43.x86_6 100% | 6.3 MiB/s | 372.9 KiB | 00m00s [116/170] rpm-sign-libs-0:6.0.0-1.fc43. 100% | 1.4 MiB/s | 28.2 KiB | 00m00s [117/170] pcre2-syntax-0:10.46-1.fc43.n 100% | 7.2 MiB/s | 162.2 KiB | 00m00s [118/170] ca-certificates-0:2025.2.80_v 100% | 8.5 MiB/s | 975.4 KiB | 00m00s [119/170] crypto-policies-0:20250714-5. 100% | 3.3 MiB/s | 75.0 KiB | 00m00s [120/170] fedora-gpg-keys-0:43-0.4.noar 100% | 5.9 MiB/s | 127.7 KiB | 00m00s [121/170] elfutils-default-yama-scope-0 100% | 540.2 KiB/s | 12.4 KiB | 00m00s [122/170] json-c-0:0.18-7.fc43.x86_64 100% | 2.2 MiB/s | 45.0 KiB | 00m00s [123/170] gnulib-l10n-0:20241231-1.fc43 100% | 6.1 MiB/s | 143.0 KiB | 00m00s [124/170] add-determinism-0:0.6.0-2.fc4 100% | 1.4 MiB/s | 919.3 KiB | 00m01s [125/170] libffi-0:3.5.1-2.fc43.x86_64 100% | 1.9 MiB/s | 40.9 KiB | 00m00s [126/170] p11-kit-trust-0:0.25.8-1.fc43 100% | 6.5 MiB/s | 139.6 KiB | 00m00s [127/170] gnupg2-0:2.4.8-4.fc43.x86_64 100% | 9.2 MiB/s | 1.6 MiB | 00m00s [128/170] p11-kit-0:0.25.8-1.fc43.x86_6 100% | 2.1 MiB/s | 490.0 KiB | 00m00s [129/170] ima-evm-utils-libs-0:1.6.2-6. 
100% | 1.2 MiB/s | 29.3 KiB | 00m00s [130/170] libfsverity-0:1.6-3.fc43.x86_ 100% | 931.4 KiB/s | 18.6 KiB | 00m00s [131/170] gpgverify-0:2.2-3.fc43.noarch 100% | 528.7 KiB/s | 11.1 KiB | 00m00s [132/170] libtasn1-0:4.20.0-2.fc43.x86_ 100% | 1.9 MiB/s | 74.5 KiB | 00m00s [133/170] tpm2-tss-0:4.1.3-8.fc43.x86_6 100% | 6.9 MiB/s | 421.3 KiB | 00m00s [134/170] libassuan-0:2.5.7-4.fc43.x86_ 100% | 2.4 MiB/s | 67.4 KiB | 00m00s [135/170] libcap-ng-0:0.8.5-8.fc43.x86_ 100% | 6.4 KiB/s | 32.1 KiB | 00m05s [136/170] gnupg2-verify-0:2.4.8-4.fc43. 100% | 1.9 MiB/s | 171.2 KiB | 00m00s [137/170] npth-0:1.8-3.fc43.x86_64 100% | 1.0 MiB/s | 25.7 KiB | 00m00s [138/170] libgpg-error-0:1.55-2.fc43.x8 100% | 3.7 MiB/s | 239.1 KiB | 00m00s [139/170] libgcrypt-0:1.11.1-2.fc43.x86 100% | 7.0 MiB/s | 595.8 KiB | 00m00s [140/170] gnupg2-gpgconf-0:2.4.8-4.fc43 100% | 4.3 MiB/s | 115.0 KiB | 00m00s [141/170] gnupg2-gpg-agent-0:2.4.8-4.fc 100% | 6.2 MiB/s | 272.9 KiB | 00m00s [142/170] gnupg2-keyboxd-0:2.4.8-4.fc43 100% | 3.7 MiB/s | 94.7 KiB | 00m00s [143/170] libusb1-0:1.0.29-4.fc43.x86_6 100% | 2.9 MiB/s | 79.9 KiB | 00m00s [144/170] libksba-0:1.6.7-4.fc43.x86_64 100% | 6.0 MiB/s | 160.4 KiB | 00m00s [145/170] gnupg2-dirmngr-0:2.4.8-4.fc43 100% | 2.3 MiB/s | 274.6 KiB | 00m00s [146/170] openldap-0:2.6.10-4.fc43.x86_ 100% | 6.0 MiB/s | 259.6 KiB | 00m00s [147/170] libevent-0:2.1.12-16.fc43.x86 100% | 6.3 MiB/s | 257.8 KiB | 00m00s [148/170] libtool-ltdl-0:2.5.4-7.fc43.x 100% | 1.6 MiB/s | 36.2 KiB | 00m00s [149/170] gnutls-0:3.8.10-3.fc43.x86_64 100% | 10.4 MiB/s | 1.4 MiB | 00m00s [150/170] libidn2-0:2.3.8-2.fc43.x86_64 100% | 7.9 MiB/s | 168.9 KiB | 00m00s [151/170] libunistring-0:1.1-10.fc43.x8 100% | 8.0 MiB/s | 542.9 KiB | 00m00s [152/170] nettle-0:3.10.1-2.fc43.x86_64 100% | 7.0 MiB/s | 424.2 KiB | 00m00s [153/170] gdbm-libs-1:1.23-10.fc43.x86_ 100% | 2.6 MiB/s | 56.8 KiB | 00m00s [154/170] fedora-release-0:43-0.23.noar 100% | 686.2 KiB/s | 13.7 KiB | 00m00s [155/170] 
systemd-standalone-sysusers-0 100% | 6.1 MiB/s | 143.8 KiB | 00m00s [156/170] xxhash-libs-0:0.8.3-3.fc43.x8 100% | 1.6 MiB/s | 38.5 KiB | 00m00s [157/170] fedora-release-identity-basic 100% | 630.3 KiB/s | 14.5 KiB | 00m00s [158/170] cyrus-sasl-lib-0:2.1.28-33.fc 100% | 2.7 MiB/s | 787.9 KiB | 00m00s [159/170] libcurl-0:8.15.0-2.fc43.x86_6 100% | 9.0 MiB/s | 404.3 KiB | 00m00s [160/170] libbrotli-0:1.1.0-10.fc43.x86 100% | 8.1 MiB/s | 339.1 KiB | 00m00s [161/170] libnghttp2-0:1.66.0-2.fc43.x8 100% | 2.6 MiB/s | 72.5 KiB | 00m00s [162/170] libpsl-0:0.21.5-6.fc43.x86_64 100% | 2.9 MiB/s | 65.0 KiB | 00m00s [163/170] libssh-0:0.11.3-1.fc43.x86_64 100% | 6.0 MiB/s | 232.8 KiB | 00m00s [164/170] keyutils-libs-0:1.6.3-6.fc43. 100% | 1.4 MiB/s | 31.4 KiB | 00m00s [165/170] libcom_err-0:1.47.3-2.fc43.x8 100% | 1.3 MiB/s | 26.8 KiB | 00m00s [166/170] krb5-libs-0:1.21.3-7.fc43.x86 100% | 3.3 MiB/s | 754.8 KiB | 00m00s [167/170] libssh-config-0:0.11.3-1.fc43 100% | 455.6 KiB/s | 9.1 KiB | 00m00s [168/170] publicsuffix-list-dafsa-0:202 100% | 2.8 MiB/s | 59.2 KiB | 00m00s [169/170] gdb-minimal-0:16.3-6.fc43.x86 100% | 10.1 MiB/s | 4.4 MiB | 00m00s [170/170] libverto-0:0.3.2-11.fc43.x86_ 100% | 4.1 KiB/s | 20.7 KiB | 00m05s -------------------------------------------------------------------------------- [170/170] Total 100% | 3.0 MiB/s | 58.6 MiB | 00m19s Running transaction [ 1/172] Verify package files 100% | 641.0 B/s | 170.0 B | 00m00s [ 2/172] Prepare transaction 100% | 1.8 KiB/s | 170.0 B | 00m00s [ 3/172] Installing libgcc-0:15.2.1-2. 100% | 131.0 MiB/s | 268.3 KiB | 00m00s [ 4/172] Installing publicsuffix-list- 100% | 68.2 MiB/s | 69.8 KiB | 00m00s [ 5/172] Installing libssh-config-0:0. 
100% | 0.0 B/s | 816.0 B | 00m00s [ 6/172] Installing fedora-release-ide 100% | 894.5 KiB/s | 916.0 B | 00m00s [ 7/172] Installing fedora-gpg-keys-0: 100% | 19.4 MiB/s | 179.0 KiB | 00m00s [ 8/172] Installing fedora-repos-0:43- 100% | 5.6 MiB/s | 5.7 KiB | 00m00s [ 9/172] Installing fedora-release-com 100% | 12.1 MiB/s | 24.9 KiB | 00m00s [ 10/172] Installing fedora-release-0:4 100% | 6.1 KiB/s | 124.0 B | 00m00s >>> Running sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch >>> Finished sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch >>> Scriptlet output: >>> Creating group 'adm' with GID 4. >>> Creating group 'audio' with GID 63. >>> Creating group 'cdrom' with GID 11. >>> Creating group 'clock' with GID 103. >>> Creating group 'dialout' with GID 18. >>> Creating group 'disk' with GID 6. >>> Creating group 'floppy' with GID 19. >>> Creating group 'ftp' with GID 50. >>> Creating group 'games' with GID 20. >>> Creating group 'input' with GID 104. >>> Creating group 'kmem' with GID 9. >>> Creating group 'kvm' with GID 36. >>> Creating group 'lock' with GID 54. >>> Creating group 'lp' with GID 7. >>> Creating group 'mail' with GID 12. >>> Creating group 'man' with GID 15. >>> Creating group 'mem' with GID 8. >>> Creating group 'nobody' with GID 65534. >>> Creating group 'render' with GID 105. >>> Creating group 'root' with GID 0. >>> Creating group 'sgx' with GID 106. >>> Creating group 'sys' with GID 3. >>> Creating group 'tape' with GID 33. >>> Creating group 'tty' with GID 5. >>> Creating group 'users' with GID 100. >>> Creating group 'utmp' with GID 22. >>> Creating group 'video' with GID 39. >>> Creating group 'wheel' with GID 10. >>> Creating user 'adm' (adm) with UID 3 and GID 4. >>> Creating group 'bin' with GID 1. >>> Creating user 'bin' (bin) with UID 1 and GID 1. >>> Creating group 'daemon' with GID 2. >>> Creating user 'daemon' (daemon) with UID 2 and GID 2. >>> Creating user 'ftp' (FTP User) with UID 14 and GID 50. 
>>> Creating user 'games' (games) with UID 12 and GID 100. >>> Creating user 'halt' (halt) with UID 7 and GID 0. >>> Creating user 'lp' (lp) with UID 4 and GID 7. >>> Creating user 'mail' (mail) with UID 8 and GID 12. >>> Creating user 'nobody' (Kernel Overflow User) with UID 65534 and GID 65534. >>> Creating user 'operator' (operator) with UID 11 and GID 0. >>> Creating user 'root' (Super User) with UID 0 and GID 0. >>> Creating user 'shutdown' (shutdown) with UID 6 and GID 0. >>> Creating user 'sync' (sync) with UID 5 and GID 0. >>> [ 11/172] Installing setup-0:2.15.0-26. 100% | 39.6 MiB/s | 730.6 KiB | 00m00s [ 12/172] Installing filesystem-0:3.18- 100% | 1.2 MiB/s | 212.8 KiB | 00m00s [ 13/172] Installing gnulib-l10n-0:2024 100% | 71.8 MiB/s | 661.9 KiB | 00m00s [ 14/172] Installing coreutils-common-0 100% | 182.1 MiB/s | 11.3 MiB | 00m00s [ 15/172] Installing pcre2-syntax-0:10. 100% | 135.6 MiB/s | 277.8 KiB | 00m00s [ 16/172] Installing ncurses-base-0:6.5 100% | 34.5 MiB/s | 353.5 KiB | 00m00s [ 17/172] Installing bash-0:5.3.0-2.fc4 100% | 191.6 MiB/s | 8.4 MiB | 00m00s [ 18/172] Installing glibc-common-0:2.4 100% | 53.7 MiB/s | 1.0 MiB | 00m00s [ 19/172] Installing glibc-gconv-extra- 100% | 132.9 MiB/s | 7.3 MiB | 00m00s [ 20/172] Installing glibc-0:2.42-4.fc4 100% | 117.6 MiB/s | 6.7 MiB | 00m00s [ 21/172] Installing ncurses-libs-0:6.5 100% | 116.3 MiB/s | 952.8 KiB | 00m00s [ 22/172] Installing glibc-minimal-lang 100% | 121.1 KiB/s | 124.0 B | 00m00s [ 23/172] Installing zlib-ng-compat-0:2 100% | 67.6 MiB/s | 138.4 KiB | 00m00s [ 24/172] Installing bzip2-libs-0:1.0.8 100% | 79.8 MiB/s | 81.7 KiB | 00m00s [ 25/172] Installing libgpg-error-0:1.5 100% | 39.1 MiB/s | 921.1 KiB | 00m00s [ 26/172] Installing libstdc++-0:15.2.1 100% | 177.7 MiB/s | 2.8 MiB | 00m00s [ 27/172] Installing xz-libs-1:5.8.1-2. 
100% | 106.9 MiB/s | 218.9 KiB | 00m00s [ 28/172] Installing libassuan-0:2.5.7- 100% | 80.9 MiB/s | 165.6 KiB | 00m00s [ 29/172] Installing libgcrypt-0:1.11.1 100% | 196.9 MiB/s | 1.6 MiB | 00m00s [ 30/172] Installing readline-0:8.3-2.f 100% | 125.5 MiB/s | 513.9 KiB | 00m00s [ 31/172] Installing libuuid-0:2.41.1-1 100% | 37.6 MiB/s | 38.5 KiB | 00m00s [ 32/172] Installing gmp-1:6.3.0-4.fc43 100% | 158.9 MiB/s | 813.5 KiB | 00m00s [ 33/172] Installing popt-0:1.19-9.fc43 100% | 19.4 MiB/s | 139.4 KiB | 00m00s [ 34/172] Installing npth-0:1.8-3.fc43. 100% | 24.8 MiB/s | 50.7 KiB | 00m00s [ 35/172] Installing libblkid-0:2.41.1- 100% | 85.7 MiB/s | 263.4 KiB | 00m00s [ 36/172] Installing libxcrypt-0:4.4.38 100% | 93.5 MiB/s | 287.2 KiB | 00m00s [ 37/172] Installing sqlite-libs-0:3.50 100% | 168.5 MiB/s | 1.5 MiB | 00m00s [ 38/172] Installing libzstd-0:1.5.7-2. 100% | 195.6 MiB/s | 801.1 KiB | 00m00s [ 39/172] Installing elfutils-libelf-0: 100% | 194.4 MiB/s | 1.2 MiB | 00m00s [ 40/172] Installing gnupg2-gpgconf-0:2 100% | 13.7 MiB/s | 252.0 KiB | 00m00s [ 41/172] Installing libattr-0:2.5.2-6. 100% | 24.8 MiB/s | 25.4 KiB | 00m00s [ 42/172] Installing libacl-0:2.3.2-4.f 100% | 35.9 MiB/s | 36.8 KiB | 00m00s [ 43/172] Installing libtasn1-0:4.20.0- 100% | 87.0 MiB/s | 178.1 KiB | 00m00s [ 44/172] Installing libunistring-0:1.1 100% | 191.9 MiB/s | 1.7 MiB | 00m00s [ 45/172] Installing libidn2-0:2.3.8-2. 100% | 19.5 MiB/s | 558.7 KiB | 00m00s [ 46/172] Installing crypto-policies-0: 100% | 11.2 MiB/s | 172.0 KiB | 00m00s [ 47/172] Installing dwz-0:0.16-2.fc43. 100% | 11.7 MiB/s | 288.5 KiB | 00m00s [ 48/172] Installing gnupg2-verify-0:2. 100% | 19.0 MiB/s | 349.9 KiB | 00m00s [ 49/172] Installing mpfr-0:4.2.2-2.fc4 100% | 163.0 MiB/s | 834.4 KiB | 00m00s [ 50/172] Installing gawk-0:5.3.2-2.fc4 100% | 60.5 MiB/s | 1.8 MiB | 00m00s [ 51/172] Installing libksba-0:1.6.7-4. 
100% | 97.9 MiB/s | 401.1 KiB | 00m00s [ 52/172] Installing unzip-0:6.0-67.fc4 100% | 20.0 MiB/s | 389.8 KiB | 00m00s [ 53/172] Installing file-libs-0:5.46-8 100% | 359.3 MiB/s | 11.9 MiB | 00m00s [ 54/172] Installing file-0:5.46-8.fc43 100% | 5.8 MiB/s | 101.7 KiB | 00m00s [ 55/172] Installing libcap-ng-0:0.8.5- 100% | 34.6 MiB/s | 70.8 KiB | 00m00s [ 56/172] Installing audit-libs-0:4.1.1 100% | 124.2 MiB/s | 381.5 KiB | 00m00s [ 57/172] Installing libsmartcols-0:2.4 100% | 88.7 MiB/s | 181.6 KiB | 00m00s [ 58/172] Installing libeconf-0:0.7.9-2 100% | 32.5 MiB/s | 66.5 KiB | 00m00s [ 59/172] Installing pam-libs-0:1.7.1-3 100% | 63.0 MiB/s | 129.0 KiB | 00m00s [ 60/172] Installing libcap-0:2.76-3.fc 100% | 11.0 MiB/s | 214.3 KiB | 00m00s [ 61/172] Installing systemd-libs-0:258 100% | 193.7 MiB/s | 2.3 MiB | 00m00s [ 62/172] Installing libsepol-0:3.9-2.f 100% | 160.7 MiB/s | 822.9 KiB | 00m00s [ 63/172] Installing pcre2-0:10.46-1.fc 100% | 170.7 MiB/s | 699.1 KiB | 00m00s [ 64/172] Installing libselinux-0:3.9-5 100% | 94.9 MiB/s | 194.4 KiB | 00m00s [ 65/172] Installing grep-0:3.12-2.fc43 100% | 32.3 MiB/s | 1.0 MiB | 00m00s [ 66/172] Installing findutils-1:4.10.0 100% | 64.1 MiB/s | 1.9 MiB | 00m00s [ 67/172] Installing sed-0:4.9-5.fc43.x 100% | 32.5 MiB/s | 865.5 KiB | 00m00s [ 68/172] Installing xz-1:5.8.1-2.fc43. 100% | 41.6 MiB/s | 1.3 MiB | 00m00s [ 69/172] Installing libmount-0:2.41.1- 100% | 121.6 MiB/s | 373.7 KiB | 00m00s [ 70/172] Installing lz4-libs-0:1.10.0- 100% | 79.3 MiB/s | 162.5 KiB | 00m00s [ 71/172] Installing alternatives-0:1.3 100% | 3.5 MiB/s | 63.8 KiB | 00m00s [ 72/172] Installing lua-libs-0:5.4.8-2 100% | 91.8 MiB/s | 281.9 KiB | 00m00s [ 73/172] Installing json-c-0:0.18-7.fc 100% | 27.3 MiB/s | 84.0 KiB | 00m00s [ 74/172] Installing libffi-0:3.5.1-2.f 100% | 41.5 MiB/s | 85.0 KiB | 00m00s [ 75/172] Installing p11-kit-0:0.25.8-1 100% | 67.4 MiB/s | 2.3 MiB | 00m00s [ 76/172] Installing p11-kit-trust-0:0. 
100% | 9.7 MiB/s | 448.2 KiB | 00m00s [ 77/172] Installing openssl-libs-1:3.5 100% | 197.8 MiB/s | 8.9 MiB | 00m00s [ 78/172] Installing coreutils-0:9.7-6. 100% | 74.7 MiB/s | 5.5 MiB | 00m00s [ 79/172] Installing ca-certificates-0: 100% | 1.0 MiB/s | 2.5 MiB | 00m02s [ 80/172] Installing gzip-0:1.13-4.fc43 100% | 25.7 MiB/s | 394.4 KiB | 00m00s [ 81/172] Installing rpm-sequoia-0:1.9. 100% | 247.8 MiB/s | 2.5 MiB | 00m00s [ 82/172] Installing libfsverity-0:1.6- 100% | 28.8 MiB/s | 29.5 KiB | 00m00s [ 83/172] Installing libevent-0:2.1.12- 100% | 216.5 MiB/s | 886.8 KiB | 00m00s [ 84/172] Installing zstd-0:1.5.7-2.fc4 100% | 95.0 MiB/s | 1.7 MiB | 00m00s [ 85/172] Installing util-linux-core-0: 100% | 49.3 MiB/s | 1.5 MiB | 00m00s [ 86/172] Installing tar-2:1.35-6.fc43. 100% | 105.6 MiB/s | 3.0 MiB | 00m00s [ 87/172] Installing libsemanage-0:3.9- 100% | 101.0 MiB/s | 310.2 KiB | 00m00s [ 88/172] Installing systemd-standalone 100% | 20.5 MiB/s | 294.1 KiB | 00m00s [ 89/172] Installing rpm-libs-0:6.0.0-1 100% | 182.7 MiB/s | 935.2 KiB | 00m00s [ 90/172] Installing libusb1-0:1.0.29-4 100% | 7.0 MiB/s | 172.9 KiB | 00m00s >>> Running sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64 >>> Finished sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64 >>> Scriptlet output: >>> Creating group 'tss' with GID 59. >>> Creating user 'tss' (Account used for TPM access) with UID 59 and GID 59. >>> [ 91/172] Installing tpm2-tss-0:4.1.3-8 100% | 142.9 MiB/s | 1.6 MiB | 00m00s [ 92/172] Installing ima-evm-utils-libs 100% | 60.5 MiB/s | 62.0 KiB | 00m00s [ 93/172] Installing gnupg2-gpg-agent-0 100% | 16.9 MiB/s | 675.4 KiB | 00m00s [ 94/172] Installing zip-0:3.0-44.fc43. 100% | 37.9 MiB/s | 698.4 KiB | 00m00s [ 95/172] Installing gnupg2-keyboxd-0:2 100% | 9.9 MiB/s | 202.7 KiB | 00m00s [ 96/172] Installing libpsl-0:0.21.5-6. 
100% | 37.9 MiB/s | 77.5 KiB | 00m00s [ 97/172] Installing liblastlog2-0:2.41 100% | 1.8 MiB/s | 35.9 KiB | 00m00s [ 98/172] Installing libfdisk-0:2.41.1- 100% | 124.2 MiB/s | 381.4 KiB | 00m00s [ 99/172] Installing nettle-0:3.10.1-2. 100% | 155.0 MiB/s | 793.7 KiB | 00m00s [100/172] Installing gnutls-0:3.8.10-3. 100% | 225.8 MiB/s | 3.8 MiB | 00m00s [101/172] Installing libxml2-0:2.12.10- 100% | 77.5 MiB/s | 1.7 MiB | 00m00s [102/172] Installing libarchive-0:3.8.1 100% | 155.1 MiB/s | 953.1 KiB | 00m00s [103/172] Installing bzip2-0:1.0.8-21.f 100% | 7.0 MiB/s | 99.8 KiB | 00m00s [104/172] Installing add-determinism-0: 100% | 106.2 MiB/s | 2.4 MiB | 00m00s [105/172] Installing build-reproducibil 100% | 1.0 MiB/s | 1.0 KiB | 00m00s [106/172] Installing cpio-0:2.15-6.fc43 100% | 52.4 MiB/s | 1.1 MiB | 00m00s [107/172] Installing diffutils-0:3.12-3 100% | 67.9 MiB/s | 1.6 MiB | 00m00s [108/172] Installing libpkgconf-0:2.3.0 100% | 77.4 MiB/s | 79.2 KiB | 00m00s [109/172] Installing pkgconf-0:2.3.0-3. 100% | 6.3 MiB/s | 91.0 KiB | 00m00s [110/172] Installing ed-0:1.22.2-1.fc43 100% | 10.5 MiB/s | 150.4 KiB | 00m00s [111/172] Installing patch-0:2.8-2.fc43 100% | 14.6 MiB/s | 224.3 KiB | 00m00s [112/172] Installing jansson-0:2.14-3.f 100% | 44.2 MiB/s | 90.5 KiB | 00m00s [113/172] Installing libgomp-0:15.2.1-2 100% | 176.6 MiB/s | 542.5 KiB | 00m00s [114/172] Installing libtool-ltdl-0:2.5 100% | 34.8 MiB/s | 71.2 KiB | 00m00s [115/172] Installing gdbm-libs-1:1.23-1 100% | 128.5 MiB/s | 131.6 KiB | 00m00s [116/172] Installing cyrus-sasl-lib-0:2 100% | 99.8 MiB/s | 2.3 MiB | 00m00s [117/172] Installing openldap-0:2.6.10- 100% | 129.6 MiB/s | 663.7 KiB | 00m00s [118/172] Installing gnupg2-dirmngr-0:2 100% | 17.8 MiB/s | 621.1 KiB | 00m00s [119/172] Installing gnupg2-0:2.4.8-4.f 100% | 152.4 MiB/s | 6.6 MiB | 00m00s [120/172] Installing rpm-sign-libs-0:6. 100% | 39.6 MiB/s | 40.6 KiB | 00m00s [121/172] Installing gpgverify-0:2.2-3. 
100% | 9.2 MiB/s | 9.4 KiB | 00m00s [122/172] Installing xxhash-libs-0:0.8. 100% | 89.4 MiB/s | 91.6 KiB | 00m00s [123/172] Installing libbrotli-0:1.1.0- 100% | 204.0 MiB/s | 835.6 KiB | 00m00s [124/172] Installing libnghttp2-0:1.66. 100% | 159.5 MiB/s | 163.3 KiB | 00m00s [125/172] Installing keyutils-libs-0:1. 100% | 54.4 MiB/s | 55.7 KiB | 00m00s [126/172] Installing libcom_err-0:1.47. 100% | 62.7 MiB/s | 64.2 KiB | 00m00s [127/172] Installing libverto-0:0.3.2-1 100% | 26.6 MiB/s | 27.2 KiB | 00m00s [128/172] Installing krb5-libs-0:1.21.3 100% | 191.0 MiB/s | 2.3 MiB | 00m00s [129/172] Installing libssh-0:0.11.3-1. 100% | 139.0 MiB/s | 569.2 KiB | 00m00s [130/172] Installing libcurl-0:8.15.0-2 100% | 176.6 MiB/s | 904.3 KiB | 00m00s [131/172] Installing curl-0:8.15.0-2.fc 100% | 10.3 MiB/s | 476.3 KiB | 00m00s [132/172] Installing rpm-0:6.0.0-1.fc43 100% | 37.9 MiB/s | 2.6 MiB | 00m00s [133/172] Installing efi-srpm-macros-0: 100% | 40.2 MiB/s | 41.1 KiB | 00m00s [134/172] Installing java-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [135/172] Installing lua-srpm-macros-0: 100% | 0.0 B/s | 1.9 KiB | 00m00s [136/172] Installing tree-sitter-srpm-m 100% | 9.1 MiB/s | 9.3 KiB | 00m00s [137/172] Installing zig-srpm-macros-0: 100% | 0.0 B/s | 1.7 KiB | 00m00s [138/172] Installing filesystem-srpm-ma 100% | 38.0 MiB/s | 38.9 KiB | 00m00s [139/172] Installing elfutils-default-y 100% | 92.9 KiB/s | 2.0 KiB | 00m00s [140/172] Installing elfutils-libs-0:0. 100% | 133.8 MiB/s | 685.2 KiB | 00m00s [141/172] Installing elfutils-debuginfo 100% | 5.3 MiB/s | 86.2 KiB | 00m00s [142/172] Installing binutils-0:2.45-1. 100% | 186.9 MiB/s | 26.5 MiB | 00m00s [143/172] Installing elfutils-0:0.193-3 100% | 112.2 MiB/s | 2.9 MiB | 00m00s [144/172] Installing gdb-minimal-0:16.3 100% | 213.8 MiB/s | 13.3 MiB | 00m00s [145/172] Installing debugedit-0:5.2-3. 
100% | 13.3 MiB/s | 217.3 KiB | 00m00s [146/172] Installing rpm-build-libs-0:6 100% | 131.5 MiB/s | 269.2 KiB | 00m00s [147/172] Installing pkgconf-m4-0:2.3.0 100% | 0.0 B/s | 14.8 KiB | 00m00s [148/172] Installing pkgconf-pkg-config 100% | 136.4 KiB/s | 1.8 KiB | 00m00s [149/172] Installing rust-srpm-macros-0 100% | 5.4 MiB/s | 5.6 KiB | 00m00s [150/172] Installing qt6-srpm-macros-0: 100% | 0.0 B/s | 740.0 B | 00m00s [151/172] Installing qt5-srpm-macros-0: 100% | 0.0 B/s | 776.0 B | 00m00s [152/172] Installing perl-srpm-macros-0 100% | 0.0 B/s | 1.1 KiB | 00m00s [153/172] Installing package-notes-srpm 100% | 0.0 B/s | 2.0 KiB | 00m00s [154/172] Installing openblas-srpm-macr 100% | 0.0 B/s | 392.0 B | 00m00s [155/172] Installing ocaml-srpm-macros- 100% | 2.1 MiB/s | 2.1 KiB | 00m00s [156/172] Installing kernel-srpm-macros 100% | 0.0 B/s | 2.3 KiB | 00m00s [157/172] Installing gnat-srpm-macros-0 100% | 0.0 B/s | 1.3 KiB | 00m00s [158/172] Installing ghc-srpm-macros-0: 100% | 0.0 B/s | 1.0 KiB | 00m00s [159/172] Installing gap-srpm-macros-0: 100% | 0.0 B/s | 2.7 KiB | 00m00s [160/172] Installing fpc-srpm-macros-0: 100% | 0.0 B/s | 420.0 B | 00m00s [161/172] Installing ansible-srpm-macro 100% | 35.4 MiB/s | 36.2 KiB | 00m00s [162/172] Installing rpm-build-0:6.0.0- 100% | 17.0 MiB/s | 296.5 KiB | 00m00s [163/172] Installing pyproject-srpm-mac 100% | 2.4 MiB/s | 2.5 KiB | 00m00s [164/172] Installing redhat-rpm-config- 100% | 46.2 MiB/s | 189.1 KiB | 00m00s [165/172] Installing forge-srpm-macros- 100% | 39.3 MiB/s | 40.3 KiB | 00m00s [166/172] Installing fonts-srpm-macros- 100% | 55.7 MiB/s | 57.0 KiB | 00m00s [167/172] Installing go-srpm-macros-0:3 100% | 61.6 MiB/s | 63.0 KiB | 00m00s [168/172] Installing python-srpm-macros 100% | 25.8 MiB/s | 52.8 KiB | 00m00s [169/172] Installing util-linux-0:2.41. 
100% | 55.0 MiB/s | 3.6 MiB | 00m00s [170/172] Installing shadow-utils-2:4.1 100% | 79.4 MiB/s | 4.0 MiB | 00m00s [171/172] Installing which-0:2.23-3.fc4 100% | 6.0 MiB/s | 85.7 KiB | 00m00s [172/172] Installing info-0:7.2-6.fc43. 100% | 117.1 KiB/s | 354.3 KiB | 00m03s Warning: skipped OpenPGP checks for 170 packages from repository: https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 Complete! Finish: installing minimal buildroot with dnf5 Start: creating root cache Finish: creating root cache Finish: chroot init INFO: Installed packages: INFO: add-determinism-0.6.0-2.fc43.x86_64 alternatives-1.33-2.fc43.x86_64 ansible-srpm-macros-1-18.1.fc43.noarch audit-libs-4.1.1-2.fc43.x86_64 bash-5.3.0-2.fc43.x86_64 binutils-2.45-1.fc43.x86_64 build-reproducibility-srpm-macros-0.6.0-2.fc43.noarch bzip2-1.0.8-21.fc43.x86_64 bzip2-libs-1.0.8-21.fc43.x86_64 ca-certificates-2025.2.80_v9.0.304-1.1.fc43.noarch coreutils-9.7-6.fc43.x86_64 coreutils-common-9.7-6.fc43.x86_64 cpio-2.15-6.fc43.x86_64 crypto-policies-20250714-5.gitcd6043a.fc43.noarch curl-8.15.0-2.fc43.x86_64 cyrus-sasl-lib-2.1.28-33.fc43.x86_64 debugedit-5.2-3.fc43.x86_64 diffutils-3.12-3.fc43.x86_64 dwz-0.16-2.fc43.x86_64 ed-1.22.2-1.fc43.x86_64 efi-srpm-macros-6-4.fc43.noarch elfutils-0.193-3.fc43.x86_64 elfutils-debuginfod-client-0.193-3.fc43.x86_64 elfutils-default-yama-scope-0.193-3.fc43.noarch elfutils-libelf-0.193-3.fc43.x86_64 elfutils-libs-0.193-3.fc43.x86_64 fedora-gpg-keys-43-0.4.noarch fedora-release-43-0.23.noarch fedora-release-common-43-0.23.noarch fedora-release-identity-basic-43-0.23.noarch fedora-repos-43-0.4.noarch file-5.46-8.fc43.x86_64 file-libs-5.46-8.fc43.x86_64 filesystem-3.18-50.fc43.x86_64 filesystem-srpm-macros-3.18-50.fc43.noarch findutils-4.10.0-6.fc43.x86_64 fonts-srpm-macros-2.0.5-23.fc43.noarch forge-srpm-macros-0.4.0-3.fc43.noarch fpc-srpm-macros-1.3-15.fc43.noarch gap-srpm-macros-2-1.fc43.noarch gawk-5.3.2-2.fc43.x86_64 gdb-minimal-16.3-6.fc43.x86_64 
gdbm-libs-1.23-10.fc43.x86_64 ghc-srpm-macros-1.9.2-3.fc43.noarch glibc-2.42-4.fc43.x86_64 glibc-common-2.42-4.fc43.x86_64 glibc-gconv-extra-2.42-4.fc43.x86_64 glibc-minimal-langpack-2.42-4.fc43.x86_64 gmp-6.3.0-4.fc43.x86_64 gnat-srpm-macros-6-8.fc43.noarch gnulib-l10n-20241231-1.fc43.noarch gnupg2-2.4.8-4.fc43.x86_64 gnupg2-dirmngr-2.4.8-4.fc43.x86_64 gnupg2-gpg-agent-2.4.8-4.fc43.x86_64 gnupg2-gpgconf-2.4.8-4.fc43.x86_64 gnupg2-keyboxd-2.4.8-4.fc43.x86_64 gnupg2-verify-2.4.8-4.fc43.x86_64 gnutls-3.8.10-3.fc43.x86_64 go-srpm-macros-3.8.0-1.fc43.noarch gpgverify-2.2-3.fc43.noarch grep-3.12-2.fc43.x86_64 gzip-1.13-4.fc43.x86_64 ima-evm-utils-libs-1.6.2-6.fc43.x86_64 info-7.2-6.fc43.x86_64 jansson-2.14-3.fc43.x86_64 java-srpm-macros-1-7.fc43.noarch json-c-0.18-7.fc43.x86_64 kernel-srpm-macros-1.0-27.fc43.noarch keyutils-libs-1.6.3-6.fc43.x86_64 krb5-libs-1.21.3-7.fc43.x86_64 libacl-2.3.2-4.fc43.x86_64 libarchive-3.8.1-3.fc43.x86_64 libassuan-2.5.7-4.fc43.x86_64 libattr-2.5.2-6.fc43.x86_64 libblkid-2.41.1-17.fc43.x86_64 libbrotli-1.1.0-10.fc43.x86_64 libcap-2.76-3.fc43.x86_64 libcap-ng-0.8.5-8.fc43.x86_64 libcom_err-1.47.3-2.fc43.x86_64 libcurl-8.15.0-2.fc43.x86_64 libeconf-0.7.9-2.fc43.x86_64 libevent-2.1.12-16.fc43.x86_64 libfdisk-2.41.1-17.fc43.x86_64 libffi-3.5.1-2.fc43.x86_64 libfsverity-1.6-3.fc43.x86_64 libgcc-15.2.1-2.fc43.x86_64 libgcrypt-1.11.1-2.fc43.x86_64 libgomp-15.2.1-2.fc43.x86_64 libgpg-error-1.55-2.fc43.x86_64 libidn2-2.3.8-2.fc43.x86_64 libksba-1.6.7-4.fc43.x86_64 liblastlog2-2.41.1-17.fc43.x86_64 libmount-2.41.1-17.fc43.x86_64 libnghttp2-1.66.0-2.fc43.x86_64 libpkgconf-2.3.0-3.fc43.x86_64 libpsl-0.21.5-6.fc43.x86_64 libselinux-3.9-5.fc43.x86_64 libsemanage-3.9-4.fc43.x86_64 libsepol-3.9-2.fc43.x86_64 libsmartcols-2.41.1-17.fc43.x86_64 libssh-0.11.3-1.fc43.x86_64 libssh-config-0.11.3-1.fc43.noarch libstdc++-15.2.1-2.fc43.x86_64 libtasn1-4.20.0-2.fc43.x86_64 libtool-ltdl-2.5.4-7.fc43.x86_64 libunistring-1.1-10.fc43.x86_64 
libusb1-1.0.29-4.fc43.x86_64 libuuid-2.41.1-17.fc43.x86_64 libverto-0.3.2-11.fc43.x86_64 libxcrypt-4.4.38-8.fc43.x86_64 libxml2-2.12.10-5.fc43.x86_64 libzstd-1.5.7-2.fc43.x86_64 lua-libs-5.4.8-2.fc43.x86_64 lua-srpm-macros-1-16.fc43.noarch lz4-libs-1.10.0-3.fc43.x86_64 mpfr-4.2.2-2.fc43.x86_64 ncurses-base-6.5-7.20250614.fc43.noarch ncurses-libs-6.5-7.20250614.fc43.x86_64 nettle-3.10.1-2.fc43.x86_64 npth-1.8-3.fc43.x86_64 ocaml-srpm-macros-11-2.fc43.noarch openblas-srpm-macros-2-20.fc43.noarch openldap-2.6.10-4.fc43.x86_64 openssl-libs-3.5.1-2.fc43.x86_64 p11-kit-0.25.8-1.fc43.x86_64 p11-kit-trust-0.25.8-1.fc43.x86_64 package-notes-srpm-macros-0.5-14.fc43.noarch pam-libs-1.7.1-3.fc43.x86_64 patch-2.8-2.fc43.x86_64 pcre2-10.46-1.fc43.x86_64 pcre2-syntax-10.46-1.fc43.noarch perl-srpm-macros-1-60.fc43.noarch pkgconf-2.3.0-3.fc43.x86_64 pkgconf-m4-2.3.0-3.fc43.noarch pkgconf-pkg-config-2.3.0-3.fc43.x86_64 popt-1.19-9.fc43.x86_64 publicsuffix-list-dafsa-20250616-2.fc43.noarch pyproject-srpm-macros-1.18.4-1.fc43.noarch python-srpm-macros-3.14-5.fc43.noarch qt5-srpm-macros-5.15.17-2.fc43.noarch qt6-srpm-macros-6.9.2-1.fc43.noarch readline-8.3-2.fc43.x86_64 redhat-rpm-config-343-11.fc43.noarch rpm-6.0.0-1.fc43.x86_64 rpm-build-6.0.0-1.fc43.x86_64 rpm-build-libs-6.0.0-1.fc43.x86_64 rpm-libs-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 rpm-sign-libs-6.0.0-1.fc43.x86_64 rust-srpm-macros-26.4-1.fc43.noarch sed-4.9-5.fc43.x86_64 setup-2.15.0-26.fc43.noarch shadow-utils-4.18.0-3.fc43.x86_64 sqlite-libs-3.50.2-2.fc43.x86_64 systemd-libs-258-1.fc43.x86_64 systemd-standalone-sysusers-258-1.fc43.x86_64 tar-1.35-6.fc43.x86_64 tpm2-tss-4.1.3-8.fc43.x86_64 tree-sitter-srpm-macros-0.4.2-1.fc43.noarch unzip-6.0-67.fc43.x86_64 util-linux-2.41.1-17.fc43.x86_64 util-linux-core-2.41.1-17.fc43.x86_64 which-2.23-3.fc43.x86_64 xxhash-libs-0.8.3-3.fc43.x86_64 xz-5.8.1-2.fc43.x86_64 xz-libs-5.8.1-2.fc43.x86_64 zig-srpm-macros-1-5.fc43.noarch zip-3.0-44.fc43.x86_64 
zlib-ng-compat-2.2.5-2.fc43.x86_64 zstd-1.5.7-2.fc43.x86_64
Start: buildsrpm
Start: rpmbuild -bs
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1755475200
Wrote: /builddir/build/SRPMS/rccl-6.4.2-5.fc43.src.rpm
Finish: rpmbuild -bs
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-43-x86_64-1759943729.176052/root/var/log/dnf5.log
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
Finish: buildsrpm
INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-nttra331/rccl/rccl.spec) Config(child) 0 minutes 53 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
Finish: clean chroot
INFO: Start(/var/lib/copr-rpmbuild/results/rccl-6.4.2-5.fc43.src.rpm) Config(fedora-43-x86_64)
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759943729.176052/root.
INFO: reusing tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759943729.176052/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1759943729.176052/root.
INFO: calling preinit hooks
INFO: enabled root cache
Start: unpacking root cache
Finish: unpacking root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Buildroot is handled by package management downloaded with a bootstrap image:
rpm-6.0.0-1.fc43.x86_64
rpm-sequoia-1.9.0-2.fc43.x86_64
dnf5-5.2.17.0-2.fc43.x86_64
dnf5-plugins-5.2.17.0-2.fc43.x86_64
Finish: chroot init
Start: build phase for rccl-6.4.2-5.fc43.src.rpm
Start: build setup for rccl-6.4.2-5.fc43.src.rpm
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1755475200
Wrote: /builddir/build/SRPMS/rccl-6.4.2-5.fc43.src.rpm
Updating and loading repositories:
Additional repo https_kojipkgs_fedorap 100% | 51.4 KiB/s | 3.5 KiB | 00m00s
Copr repository 100% | 22.4 KiB/s | 1.5 KiB | 00m00s
fedora 100% | 46.4 KiB/s | 6.0 KiB | 00m00s
updates 100% | 320.9 KiB/s | 30.2 KiB | 00m00s
Repositories loaded.
Package Arch Version Repository Size Installing: cmake x86_64 3.31.6-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 34.5 MiB gcc-c++ x86_64 15.2.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 41.4 MiB gtest-devel x86_64 1.15.2-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.0 MiB hipify x86_64 6.4.1-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.1 MiB rocm-cmake noarch 6.4.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 130.5 KiB rocm-comgr-devel x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 98.2 KiB rocm-core-devel x86_64 6.4.4-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 14.8 KiB rocm-hip-devel x86_64 6.4.2-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.8 MiB rocm-rpm-macros noarch 6.4.2-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 18.9 KiB rocm-runtime-devel x86_64 6.4.2-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 571.4 KiB rocm-smi-devel x86_64 6.4.3-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 281.8 KiB Installing dependencies: annobin-docs noarch 12.99-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 98.9 KiB annobin-plugin-gcc x86_64 12.99-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.0 MiB cmake-data noarch 3.31.6-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.5 MiB cmake-filesystem x86_64 3.31.6-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 0.0 B cmake-rpm-macros noarch 3.31.6-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 7.7 KiB cpp x86_64 
15.2.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 37.9 MiB emacs-filesystem noarch 1:30.0-5.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 0.0 B environment-modules x86_64 5.6.0-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.9 MiB expat x86_64 2.7.2-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 298.6 KiB gcc x86_64 15.2.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 111.9 MiB gcc-plugin-annobin x86_64 15.2.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 57.2 KiB git x86_64 2.51.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 56.4 KiB git-core x86_64 2.51.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 23.6 MiB git-core-doc noarch 2.51.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 17.7 MiB glibc-devel x86_64 2.42-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.3 MiB gmock x86_64 1.15.2-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 131.0 KiB groff-base x86_64 1.23.0-10.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.8 MiB gtest x86_64 1.15.2-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 501.8 KiB hipcc x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 652.9 KiB hwdata noarch 0.399-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 9.6 MiB jsoncpp x86_64 1.9.6-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 257.6 KiB kernel-headers x86_64 6.17.0-63.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.7 MiB less x86_64 679-2.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 406.1 KiB libcbor x86_64 0.12.0-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 77.8 KiB libdb x86_64 5.3.28-66.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.9 MiB libdrm x86_64 2.4.125-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 395.8 KiB libdrm-devel x86_64 2.4.125-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 728.8 KiB libedit x86_64 3.1-56.20250104cvs.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 240.1 KiB libfido2 x86_64 1.16.0-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 238.5 KiB libmpc x86_64 1.3.1-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 160.6 KiB libpciaccess x86_64 0.16-16.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 44.5 KiB libpciaccess-devel x86_64 0.16-16.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 15.3 KiB libpipeline x86_64 1.5.8-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 145.1 KiB libstdc++-devel x86_64 15.2.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 37.3 MiB libtommath x86_64 1.3.1~rc1-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 126.4 KiB libuv x86_64 1:1.51.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 570.2 KiB libxcrypt-devel x86_64 4.4.38-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 30.8 KiB make x86_64 1:4.4.1-11.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.8 MiB man-db x86_64 2.13.1-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.9 MiB mpdecimal x86_64 4.0.1-2.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 217.2 KiB ncurses x86_64 6.5-7.20250614.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 609.8 KiB numactl-libs x86_64 2.0.19-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 56.9 KiB openssh x86_64 10.0p1-5.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.4 MiB openssh-clients x86_64 10.0p1-5.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.6 MiB perl x86_64 4:5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 0.0 B perl-Algorithm-Diff noarch 1.2010-14.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 107.5 KiB perl-Archive-Tar noarch 3.04-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 154.4 KiB perl-Archive-Zip noarch 1.68-17.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 291.1 KiB perl-Attribute-Handlers noarch 1.03-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 39.9 KiB perl-AutoLoader noarch 5.74-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 20.6 KiB perl-AutoSplit noarch 5.74-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 23.1 KiB perl-B x86_64 1.89-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 501.3 KiB perl-Benchmark noarch 1.27-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 36.4 KiB perl-CPAN noarch 2.38-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.9 MiB perl-CPAN-Meta noarch 2.150010-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 592.2 KiB perl-CPAN-Meta-Requirements noarch 2.143-13.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 81.2 KiB perl-CPAN-Meta-YAML noarch 0.020-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 52.1 KiB perl-Carp noarch 1.54-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 46.6 KiB perl-Class-Struct noarch 0.68-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 25.4 KiB perl-Compress-Bzip2 x86_64 2.28-24.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 142.6 KiB perl-Compress-Raw-Bzip2 x86_64 2.213-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 67.3 KiB perl-Compress-Raw-Lzma x86_64 2.213-7.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 120.9 KiB perl-Compress-Raw-Zlib x86_64 2.213-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 163.2 KiB perl-Config-Extensions noarch 0.03-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.7 KiB perl-Config-Perl-V noarch 0.38-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 25.9 KiB perl-DBM_Filter noarch 0.07-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 28.7 KiB perl-DB_File x86_64 1.859-516.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 188.8 KiB perl-Data-Dumper x86_64 2.191-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 115.6 KiB perl-Data-OptList noarch 0.114-7.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 50.1 KiB perl-Data-Section noarch 0.200008-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 42.7 KiB perl-Devel-PPPort x86_64 3.73-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 889.8 KiB perl-Devel-Peek x86_64 
1.36-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 43.5 KiB perl-Devel-SelfStubber noarch 1.06-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.8 KiB perl-Devel-Size x86_64 0.85-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 42.1 KiB perl-Digest noarch 1.20-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 35.3 KiB perl-Digest-MD5 x86_64 2.59-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 59.7 KiB perl-Digest-SHA x86_64 1:6.04-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 112.5 KiB perl-DirHandle noarch 1.05-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.4 KiB perl-Dumpvalue noarch 2.27-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 19.8 KiB perl-DynaLoader x86_64 1.57-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 32.1 KiB perl-Encode x86_64 4:3.21-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.7 MiB perl-Encode-devel x86_64 4:3.21-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 99.6 KiB perl-English noarch 1.11-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.2 KiB perl-Env noarch 1.06-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 26.1 KiB perl-Errno x86_64 1.38-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.4 KiB perl-Error noarch 1:0.17030-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 76.7 KiB perl-Exporter noarch 5.79-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 54.3 KiB perl-ExtUtils-CBuilder noarch 1:0.280242-520.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 97.3 KiB perl-ExtUtils-Command noarch 2:7.76-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 9.6 KiB perl-ExtUtils-Constant noarch 0.25-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 85.9 KiB perl-ExtUtils-Embed noarch 1.35-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 15.6 KiB perl-ExtUtils-Install noarch 2.22-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 85.5 KiB perl-ExtUtils-MM-Utils noarch 2:7.76-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.9 KiB perl-ExtUtils-MakeMaker noarch 2:7.76-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 739.7 KiB perl-ExtUtils-Manifest noarch 1:1.75-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 84.8 KiB perl-ExtUtils-Miniperl noarch 1.14-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.3 KiB perl-ExtUtils-ParseXS noarch 1:3.58-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 539.6 KiB perl-Fcntl x86_64 1.20-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 48.8 KiB perl-File-Basename noarch 2.86-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 14.0 KiB perl-File-Compare noarch 1.100.800-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 5.6 KiB perl-File-Copy noarch 2.41-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 19.7 KiB perl-File-DosGlob x86_64 1.12-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 20.8 KiB perl-File-Fetch noarch 1.08-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 60.3 KiB perl-File-Find 
noarch 1.44-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 42.0 KiB perl-File-HomeDir noarch 1.006-15.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 119.3 KiB perl-File-Path noarch 2.18-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 63.5 KiB perl-File-Temp noarch 1:0.231.100-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 162.3 KiB perl-File-Which noarch 1.27-14.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 30.4 KiB perl-File-stat noarch 1.14-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 12.5 KiB perl-FileCache noarch 1.10-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 7.5 KiB perl-FileHandle noarch 2.05-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 9.4 KiB perl-Filter x86_64 2:1.64-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 156.7 KiB perl-Filter-Simple noarch 0.96-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 50.7 KiB perl-FindBin noarch 1.54-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.8 KiB perl-GDBM_File x86_64 1:1.24-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 79.6 KiB perl-Getopt-Long noarch 1:2.58-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 144.5 KiB perl-Getopt-Std noarch 1.14-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 11.2 KiB perl-Git noarch 2.51.0-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 64.4 KiB perl-HTTP-Tiny noarch 0.090-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 154.4 KiB perl-Hash-Util x86_64 0.32-520.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 55.0 KiB perl-Hash-Util-FieldHash x86_64 1.27-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 62.6 KiB perl-I18N-Collate noarch 1.02-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 7.1 KiB perl-I18N-LangTags noarch 0.45-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 82.4 KiB perl-I18N-Langinfo x86_64 0.24-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 34.7 KiB perl-IO x86_64 1.55-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 147.4 KiB perl-IO-Compress noarch 2.213-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.0 MiB perl-IO-Compress-Lzma noarch 2.213-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 215.2 KiB perl-IO-Socket-IP noarch 0.43-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 100.3 KiB perl-IO-Socket-SSL noarch 2.095-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 714.5 KiB perl-IO-Zlib noarch 1:1.15-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 25.7 KiB perl-IPC-Cmd noarch 2:1.04-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 84.9 KiB perl-IPC-Open3 noarch 1.24-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 27.7 KiB perl-IPC-SysV x86_64 2.09-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 73.7 KiB perl-IPC-System-Simple noarch 1.30-16.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 71.7 KiB perl-JSON-PP noarch 1:4.16-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 141.8 KiB perl-Locale-Maketext noarch 1.33-521.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 171.3 KiB perl-Locale-Maketext-Simple noarch 1:0.21-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 12.8 KiB perl-MIME-Base32 noarch 1.303-24.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 30.7 KiB perl-MIME-Base64 x86_64 3.16-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 42.0 KiB perl-MRO-Compat noarch 0.15-12.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 43.0 KiB perl-Math-BigInt noarch 1:2.0050.03-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.1 MiB perl-Math-BigInt-FastCalc x86_64 0.502.000-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 44.0 KiB perl-Math-Complex noarch 1.63-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 85.1 KiB perl-Memoize noarch 1.17-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 64.7 KiB perl-Module-Build noarch 2:0.42.34-9.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 654.2 KiB perl-Module-CoreList noarch 1:5.20250923-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.2 MiB perl-Module-CoreList-tools noarch 1:5.20250923-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 18.6 KiB perl-Module-Load noarch 1:0.36-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 14.9 KiB perl-Module-Load-Conditional noarch 0.74-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 28.7 KiB perl-Module-Loaded noarch 1:0.08-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 5.0 KiB perl-Module-Metadata noarch 1.000038-520.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 67.5 KiB perl-Module-Signature noarch 0.93-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 136.5 KiB perl-NDBM_File x86_64 1.18-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 28.5 KiB perl-NEXT noarch 0.69-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 23.6 KiB perl-Net noarch 1.04-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 22.4 KiB perl-Net-Ping noarch 2.76-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 134.2 KiB perl-Net-SSLeay x86_64 1.94-11.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.3 MiB perl-ODBM_File x86_64 1.20-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 28.5 KiB perl-Opcode x86_64 1.69-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 48.6 KiB perl-POSIX x86_64 2.23-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 231.4 KiB perl-Package-Generator noarch 1.106-34.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 29.9 KiB perl-Params-Check noarch 1:0.38-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 27.6 KiB perl-Params-Util x86_64 1.102-19.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 58.5 KiB perl-PathTools x86_64 3.94-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 180.0 KiB perl-Perl-OSType noarch 1.010-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 32.8 KiB perl-PerlIO-via-QuotedPrint noarch 0.10-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 30.2 KiB perl-Pod-Checker noarch 4:1.77-520.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 52.2 KiB perl-Pod-Escapes noarch 1:1.07-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 24.9 KiB perl-Pod-Functions noarch 1.14-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 14.4 KiB perl-Pod-Html noarch 1.35-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 42.3 KiB perl-Pod-Perldoc noarch 3.28.01-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 163.7 KiB perl-Pod-Simple noarch 1:3.47-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 565.3 KiB perl-Pod-Usage noarch 4:2.05-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 86.3 KiB perl-Safe noarch 2.47-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 30.7 KiB perl-Scalar-List-Utils x86_64 5:1.70-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 144.9 KiB perl-Search-Dict noarch 1.08-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.7 KiB perl-SelectSaver noarch 1.02-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.2 KiB perl-SelfLoader noarch 1.28-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 22.2 KiB perl-Socket x86_64 4:2.040-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 120.3 KiB perl-Software-License noarch 0.104007-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 500.7 KiB perl-Storable x86_64 1:3.37-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 231.2 KiB perl-Sub-Exporter noarch 0.991-6.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 194.9 KiB perl-Sub-Install noarch 0.929-8.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 35.9 KiB perl-Symbol noarch 1.09-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.8 KiB perl-Sys-Hostname x86_64 1.25-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 15.8 KiB perl-Sys-Syslog x86_64 0.36-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 94.7 KiB perl-Term-ANSIColor noarch 5.01-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 97.5 KiB perl-Term-Cap noarch 1.18-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 29.3 KiB perl-Term-Complete noarch 1.403-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 5.8 KiB perl-Term-ReadLine noarch 1.17-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 17.3 KiB perl-Term-Table noarch 0.025-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 78.0 KiB perl-TermReadKey x86_64 2.38-26.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 64.0 KiB perl-Test noarch 1.31-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 37.0 KiB perl-Test-Harness noarch 1:3.52-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 560.8 KiB perl-Test-Simple noarch 3:1.302214-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.7 MiB perl-Text-Abbrev noarch 1.02-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.1 KiB perl-Text-Balanced noarch 2.07-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 111.5 KiB perl-Text-Diff noarch 1.45-24.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 83.0 KiB perl-Text-Glob noarch 0.11-26.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.4 KiB perl-Text-ParseWords noarch 3.31-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 13.6 KiB perl-Text-Tabs+Wrap noarch 2024.001-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 22.6 KiB perl-Text-Template noarch 1.61-8.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 112.4 KiB perl-Thread noarch 3.06-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 12.1 KiB perl-Thread-Queue noarch 3.14-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 28.9 KiB perl-Thread-Semaphore noarch 2.13-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 10.0 KiB perl-Tie noarch 4.6-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 32.1 KiB perl-Tie-File noarch 1.10-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 85.6 KiB perl-Tie-Memoize noarch 1.1-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.2 KiB perl-Tie-RefHash noarch 1.41-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 35.9 KiB perl-Time noarch 1.04-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 9.8 KiB perl-Time-HiRes x86_64 4:1.9778-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 115.8 KiB perl-Time-Local noarch 2:1.350-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 69.0 KiB perl-Time-Piece x86_64 1.3600-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 71.2 KiB perl-URI noarch 5.34-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 268.0 KiB perl-Unicode-Collate x86_64 1.31-520.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.2 MiB perl-Unicode-Normalize x86_64 1.32-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 486.2 KiB perl-Unicode-UCD noarch 0.81-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 206.4 KiB perl-User-pwent noarch 1.05-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 17.1 KiB perl-autodie noarch 2.37-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 214.9 KiB perl-autouse noarch 1.11-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 5.9 KiB perl-base noarch 2.27-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 12.6 KiB perl-bignum noarch 0.67-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 133.1 KiB perl-blib noarch 1.07-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.2 KiB perl-constant noarch 1.33-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 26.2 KiB perl-debugger noarch 1.60-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 403.2 KiB perl-deprecate noarch 0.04-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.6 KiB perl-devel x86_64 4:5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.8 MiB perl-diagnostics noarch 1.40-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 471.0 KiB perl-doc noarch 5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 11.5 MiB perl-encoding x86_64 4:3.00-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 149.5 KiB perl-encoding-warnings noarch 0.14-520.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 10.1 KiB perl-experimental noarch 0.036-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 43.4 KiB perl-fields noarch 2.27-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 11.9 KiB perl-filetest noarch 1.03-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.4 KiB perl-if noarch 0.61.000-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 5.8 KiB perl-inc-latest noarch 2:0.500-31.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 34.6 KiB perl-interpreter x86_64 4:5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 118.6 KiB perl-less noarch 0.03-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.9 KiB perl-lib x86_64 0.65-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 8.5 KiB perl-libnet noarch 3.15-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 289.4 KiB perl-libnetcfg noarch 4:5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 16.9 KiB perl-libs x86_64 4:5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 11.5 MiB perl-local-lib noarch 2.000029-10.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 117.6 KiB perl-locale noarch 1.13-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 6.1 KiB perl-macros noarch 4:5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 5.5 KiB perl-meta-notation noarch 5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.0 KiB perl-mro x86_64 1.29-520.fc43 
https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 41.6 KiB perl-open noarch 1.13-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 11.3 KiB perl-overload noarch 1.40-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 71.6 KiB perl-overloading noarch 0.02-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.9 KiB perl-parent noarch 1:0.244-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 10.3 KiB perl-perlfaq noarch 5.20250619-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 733.6 KiB perl-ph x86_64 5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 276.4 KiB perl-podlators noarch 1:6.0.2-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 317.5 KiB perl-sigtrap noarch 1.10-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 11.1 KiB perl-sort noarch 2.06-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.8 KiB perl-subs noarch 1.04-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.1 KiB perl-threads x86_64 1:2.43-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 115.1 KiB perl-threads-shared x86_64 1.70-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 83.6 KiB perl-utils noarch 5.42.0-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 97.0 KiB perl-vars noarch 1.05-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.9 KiB perl-version x86_64 9:0.99.33-521.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 128.7 KiB perl-vmsish noarch 1.04-520.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 
6.6 KiB procps-ng x86_64 4.0.4-7.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.0 MiB python-pip-wheel noarch 25.1.1-18.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.2 MiB python3 x86_64 3.14.0~rc3-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 28.9 KiB python3-libs x86_64 3.14.0~rc3-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 43.0 MiB python3-pyparsing noarch 3.1.2-14.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.0 MiB rhash x86_64 1.4.5-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 351.1 KiB rocm-clang x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 70.2 MiB rocm-clang-devel x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 23.3 MiB rocm-clang-libs x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 98.4 MiB rocm-clang-runtime-devel x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 7.8 MiB rocm-comgr x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 123.9 MiB rocm-core x86_64 6.4.4-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 12.3 KiB rocm-device-libs x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.2 MiB rocm-hip x86_64 6.4.2-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 24.9 MiB rocm-libc++ x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.2 MiB rocm-libc++-devel x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 7.5 MiB rocm-lld x86_64 
19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 5.7 MiB rocm-llvm x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 48.5 MiB rocm-llvm-devel x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 25.3 MiB rocm-llvm-filesystem x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 0.0 B rocm-llvm-libs x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 84.8 MiB rocm-llvm-static x86_64 19-14.rocm6.4.2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.8 GiB rocm-runtime x86_64 6.4.2-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 3.1 MiB rocm-smi x86_64 6.4.3-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 2.7 MiB systemtap-sdt-devel x86_64 5.3-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 182.9 KiB systemtap-sdt-dtrace x86_64 5.3-4.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 179.6 KiB tcl x86_64 1:9.0.2-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 4.3 MiB tzdata noarch 2025b-3.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 1.6 MiB vim-filesystem noarch 2:9.1.1818-1.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 40.0 B zlib-ng-compat-devel x86_64 2.2.5-2.fc43 https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 107.0 KiB Transaction Summary: Installing: 304 packages Total size of inbound packages is 528 MiB. Need to download 528 MiB. After this operation, 3 GiB extra will be used (install 3 GiB, remove 0 B). 
[ 1/304] gtest-devel-0:1.15.2-4.fc43.x 100% | 1.6 MiB/s | 243.4 KiB | 00m00s [ 2/304] hipify-0:6.4.1-3.fc43.x86_64 100% | 7.2 MiB/s | 505.2 KiB | 00m00s [ 3/304] rocm-cmake-0:6.4.0-2.fc43.noa 100% | 1.7 MiB/s | 37.2 KiB | 00m00s [ 4/304] rocm-comgr-devel-0:19-14.rocm 100% | 1.3 MiB/s | 31.8 KiB | 00m00s [ 5/304] rocm-core-devel-0:6.4.4-1.fc4 100% | 426.0 KiB/s | 13.2 KiB | 00m00s [ 6/304] cmake-0:3.31.6-4.fc43.x86_64 100% | 25.2 MiB/s | 12.2 MiB | 00m00s [ 7/304] rocm-rpm-macros-0:6.4.2-1.fc4 100% | 574.6 KiB/s | 16.1 KiB | 00m00s [ 8/304] gcc-c++-0:15.2.1-2.fc43.x86_6 100% | 28.0 MiB/s | 15.2 MiB | 00m01s [ 9/304] rocm-runtime-devel-0:6.4.2-2. 100% | 3.3 MiB/s | 93.4 KiB | 00m00s [ 10/304] rocm-core-0:6.4.4-1.fc43.x86_ 100% | 533.8 KiB/s | 13.3 KiB | 00m00s [ 11/304] rocm-smi-devel-0:6.4.3-1.fc43 100% | 2.1 MiB/s | 57.0 KiB | 00m00s [ 12/304] rocm-hip-devel-0:6.4.2-2.fc43 100% | 799.1 KiB/s | 233.3 KiB | 00m00s [ 13/304] cmake-filesystem-0:3.31.6-4.f 100% | 737.6 KiB/s | 15.5 KiB | 00m00s [ 14/304] libdrm-devel-0:2.4.125-2.fc43 100% | 6.3 MiB/s | 174.3 KiB | 00m00s [ 15/304] python3-0:3.14.0~rc3-1.fc43.x 100% | 1.3 MiB/s | 27.6 KiB | 00m00s [ 16/304] rocm-runtime-0:6.4.2-2.fc43.x 100% | 21.9 MiB/s | 649.5 KiB | 00m00s [ 17/304] rocm-smi-0:6.4.3-1.fc43.x86_6 100% | 12.3 MiB/s | 605.5 KiB | 00m00s [ 18/304] libdrm-0:2.4.125-2.fc43.x86_6 100% | 6.6 MiB/s | 161.3 KiB | 00m00s [ 19/304] numactl-libs-0:2.0.19-3.fc43. 
100% | 1.5 MiB/s | 31.1 KiB | 00m00s [ 20/304] perl-File-Basename-0:2.86-520 100% | 858.4 KiB/s | 17.2 KiB | 00m00s [ 21/304] perl-File-Copy-0:2.41-520.fc4 100% | 1.0 MiB/s | 20.1 KiB | 00m00s [ 22/304] perl-File-Which-0:1.27-14.fc4 100% | 1.0 MiB/s | 21.4 KiB | 00m00s [ 23/304] perl-Getopt-Std-0:1.14-520.fc 100% | 785.3 KiB/s | 15.7 KiB | 00m00s [ 24/304] environment-modules-0:5.6.0-1 100% | 13.2 MiB/s | 785.0 KiB | 00m00s [ 25/304] perl-PathTools-0:3.94-520.fc4 100% | 3.9 MiB/s | 83.2 KiB | 00m00s [ 26/304] perl-Scalar-List-Utils-5:1.70 100% | 3.3 MiB/s | 75.0 KiB | 00m00s [ 27/304] perl-URI-0:5.34-1.fc43.noarch 100% | 6.2 MiB/s | 134.2 KiB | 00m00s [ 28/304] perl-interpreter-4:5.42.0-520 100% | 3.4 MiB/s | 72.4 KiB | 00m00s [ 29/304] hipcc-0:19-14.rocm6.4.2.fc43. 100% | 4.8 MiB/s | 133.6 KiB | 00m00s [ 30/304] perl-libs-4:5.42.0-520.fc43.x 100% | 37.2 MiB/s | 2.5 MiB | 00m00s [ 31/304] rocm-device-libs-0:19-14.rocm 100% | 16.0 MiB/s | 491.0 KiB | 00m00s [ 32/304] python3-libs-0:3.14.0~rc3-1.f 100% | 32.4 MiB/s | 9.5 MiB | 00m00s [ 33/304] perl-Carp-0:1.54-520.fc43.noa 100% | 1.2 MiB/s | 28.7 KiB | 00m00s [ 34/304] perl-DynaLoader-0:1.57-520.fc 100% | 1.3 MiB/s | 26.0 KiB | 00m00s [ 35/304] perl-Encode-4:3.21-520.fc43.x 100% | 22.7 MiB/s | 1.0 MiB | 00m00s [ 36/304] perl-Exporter-0:5.79-520.fc43 100% | 1.4 MiB/s | 30.9 KiB | 00m00s [ 37/304] perl-Data-Dumper-0:2.191-521. 
100% | 2.6 MiB/s | 56.3 KiB | 00m00s [ 38/304] perl-MIME-Base32-0:1.303-24.f 100% | 1.0 MiB/s | 20.4 KiB | 00m00s [ 39/304] perl-MIME-Base64-0:3.16-520.f 100% | 1.5 MiB/s | 29.7 KiB | 00m00s [ 40/304] perl-base-0:2.27-520.fc43.noa 100% | 811.2 KiB/s | 16.2 KiB | 00m00s [ 41/304] rocm-hip-0:6.4.2-2.fc43.x86_6 100% | 15.4 MiB/s | 9.5 MiB | 00m01s [ 42/304] perl-constant-0:1.33-521.fc43 100% | 843.4 KiB/s | 22.8 KiB | 00m00s [ 43/304] perl-libnet-0:3.15-521.fc43.n 100% | 5.0 MiB/s | 122.8 KiB | 00m00s [ 44/304] perl-overload-0:1.40-520.fc43 100% | 1.7 MiB/s | 45.6 KiB | 00m00s [ 45/304] perl-parent-1:0.244-520.fc43. 100% | 592.2 KiB/s | 14.8 KiB | 00m00s [ 46/304] perl-Errno-0:1.38-520.fc43.x8 100% | 574.7 KiB/s | 14.9 KiB | 00m00s [ 47/304] perl-Getopt-Long-1:2.58-520.f 100% | 3.0 MiB/s | 63.6 KiB | 00m00s [ 48/304] perl-Storable-1:3.37-521.fc43 100% | 4.8 MiB/s | 98.5 KiB | 00m00s [ 49/304] perl-vars-0:1.05-520.fc43.noa 100% | 618.5 KiB/s | 13.0 KiB | 00m00s [ 50/304] perl-B-0:1.89-520.fc43.x86_64 100% | 7.9 MiB/s | 177.7 KiB | 00m00s [ 51/304] perl-if-0:0.61.000-520.fc43.n 100% | 500.1 KiB/s | 14.0 KiB | 00m00s [ 52/304] perl-overloading-0:0.02-520.f 100% | 478.1 KiB/s | 12.9 KiB | 00m00s [ 53/304] libpciaccess-devel-0:0.16-16. 100% | 401.4 KiB/s | 12.4 KiB | 00m00s [ 54/304] libpciaccess-0:0.16-16.fc43.x 100% | 1.2 MiB/s | 26.2 KiB | 00m00s [ 55/304] kernel-headers-0:6.17.0-63.fc 100% | 21.7 MiB/s | 1.5 MiB | 00m00s [ 56/304] perl-4:5.42.0-520.fc43.x86_64 100% | 648.5 KiB/s | 13.6 KiB | 00m00s [ 57/304] rocm-comgr-0:19-14.rocm6.4.2. 
100% | 29.6 MiB/s | 30.5 MiB | 00m01s [ 58/304] gmock-0:1.15.2-4.fc43.x86_64 100% | 2.2 MiB/s | 61.1 KiB | 00m00s [ 59/304] gtest-0:1.15.2-4.fc43.x86_64 100% | 7.0 MiB/s | 186.0 KiB | 00m00s [ 60/304] rocm-clang-libs-0:19-14.rocm6 100% | 26.1 MiB/s | 22.8 MiB | 00m01s [ 61/304] libmpc-0:1.3.1-8.fc43.x86_64 100% | 2.5 MiB/s | 70.4 KiB | 00m00s [ 62/304] libstdc++-devel-0:15.2.1-2.fc 100% | 15.0 MiB/s | 5.2 MiB | 00m00s [ 63/304] rocm-llvm-libs-0:19-14.rocm6. 100% | 9.5 MiB/s | 20.2 MiB | 00m02s [ 64/304] glibc-devel-0:2.42-4.fc43.x86 100% | 1.3 MiB/s | 487.0 KiB | 00m00s [ 65/304] make-1:4.4.1-11.fc43.x86_64 100% | 488.3 KiB/s | 578.1 KiB | 00m01s [ 66/304] emacs-filesystem-1:30.0-5.fc4 100% | 356.6 KiB/s | 7.5 KiB | 00m00s [ 67/304] less-0:679-2.fc43.x86_64 100% | 435.0 KiB/s | 195.3 KiB | 00m00s [ 68/304] man-db-0:2.13.1-2.fc43.x86_64 100% | 566.1 KiB/s | 1.3 MiB | 00m02s [ 69/304] procps-ng-0:4.0.4-7.fc43.x86_ 100% | 2.1 MiB/s | 356.7 KiB | 00m00s [ 70/304] tcl-1:9.0.2-1.fc43.x86_64 100% | 2.8 MiB/s | 1.2 MiB | 00m00s [ 71/304] cpp-0:15.2.1-2.fc43.x86_64 100% | 2.0 MiB/s | 12.9 MiB | 00m06s [ 72/304] cmake-data-0:3.31.6-4.fc43.no 100% | 3.2 MiB/s | 1.9 MiB | 00m01s [ 73/304] expat-0:2.7.2-1.fc43.x86_64 100% | 3.0 MiB/s | 118.9 KiB | 00m00s [ 74/304] jsoncpp-0:1.9.6-2.fc43.x86_64 100% | 2.0 MiB/s | 101.1 KiB | 00m00s [ 75/304] libuv-1:1.51.0-2.fc43.x86_64 100% | 2.5 MiB/s | 266.1 KiB | 00m00s [ 76/304] rhash-0:1.4.5-3.fc43.x86_64 100% | 2.8 MiB/s | 197.9 KiB | 00m00s [ 77/304] libtommath-0:1.3.1~rc1-6.fc43 100% | 483.4 KiB/s | 64.3 KiB | 00m00s [ 78/304] rocm-libc++-0:19-14.rocm6.4.2 100% | 2.5 MiB/s | 345.8 KiB | 00m00s [ 79/304] rocm-llvm-filesystem-0:19-14. 
100% | 1.0 MiB/s | 24.7 KiB | 00m00s [ 80/304] vim-filesystem-2:9.1.1818-1.f 100% | 7.4 KiB/s | 15.5 KiB | 00m02s [ 81/304] rocm-lld-0:19-14.rocm6.4.2.fc 100% | 3.6 MiB/s | 1.5 MiB | 00m00s [ 82/304] rocm-clang-devel-0:19-14.rocm 100% | 3.1 MiB/s | 2.4 MiB | 00m01s [ 83/304] gcc-0:15.2.1-2.fc43.x86_64 100% | 3.9 MiB/s | 39.7 MiB | 00m10s [ 84/304] git-0:2.51.0-2.fc43.x86_64 100% | 1.9 MiB/s | 41.1 KiB | 00m00s [ 85/304] rocm-clang-runtime-devel-0:19 100% | 5.1 MiB/s | 594.0 KiB | 00m00s [ 86/304] rocm-libc++-devel-0:19-14.roc 100% | 5.3 MiB/s | 903.8 KiB | 00m00s [ 87/304] mpdecimal-0:4.0.1-2.fc43.x86_ 100% | 4.5 MiB/s | 97.1 KiB | 00m00s [ 88/304] python-pip-wheel-0:25.1.1-18. 100% | 5.5 MiB/s | 1.2 MiB | 00m00s [ 89/304] tzdata-0:2025b-3.fc43.noarch 100% | 3.9 MiB/s | 429.3 KiB | 00m00s [ 90/304] perl-mro-0:1.29-520.fc43.x86_ 100% | 1.4 MiB/s | 29.9 KiB | 00m00s [ 91/304] perl-Digest-MD5-0:2.59-520.fc 100% | 1.6 MiB/s | 35.8 KiB | 00m00s [ 92/304] perl-Fcntl-0:1.20-520.fc43.x8 100% | 1.1 MiB/s | 29.8 KiB | 00m00s [ 93/304] perl-FileHandle-0:2.05-520.fc 100% | 704.5 KiB/s | 15.5 KiB | 00m00s [ 94/304] perl-IO-0:1.55-520.fc43.x86_6 100% | 3.8 MiB/s | 78.0 KiB | 00m00s [ 95/304] perl-IO-Socket-IP-0:0.43-521. 100% | 2.1 MiB/s | 42.1 KiB | 00m00s [ 96/304] perl-POSIX-0:2.23-520.fc43.x8 100% | 2.6 MiB/s | 97.8 KiB | 00m00s [ 97/304] perl-Socket-4:2.040-2.fc43.x8 100% | 2.7 MiB/s | 54.9 KiB | 00m00s [ 98/304] perl-Symbol-0:1.09-520.fc43.n 100% | 710.2 KiB/s | 14.2 KiB | 00m00s [ 99/304] perl-Time-Local-2:1.350-520.f 100% | 1.7 MiB/s | 34.4 KiB | 00m00s [100/304] perl-locale-0:1.13-520.fc43.n 100% | 675.2 KiB/s | 13.5 KiB | 00m00s [101/304] rocm-clang-0:19-14.rocm6.4.2. 
100% | 3.2 MiB/s | 16.0 MiB | 00m05s [102/304] perl-SelectSaver-0:1.02-520.f 100% | 617.0 KiB/s | 11.7 KiB | 00m00s [103/304] perl-Pod-Usage-4:2.05-520.fc4 100% | 2.0 MiB/s | 40.5 KiB | 00m00s [104/304] perl-Text-ParseWords-0:3.31-5 100% | 860.4 KiB/s | 16.3 KiB | 00m00s [105/304] perl-Class-Struct-0:0.68-520. 100% | 1.1 MiB/s | 22.1 KiB | 00m00s [106/304] perl-Digest-0:1.20-520.fc43.n 100% | 1.3 MiB/s | 24.8 KiB | 00m00s [107/304] perl-Archive-Tar-0:3.04-521.f 100% | 3.1 MiB/s | 70.9 KiB | 00m00s [108/304] perl-Attribute-Handlers-0:1.0 100% | 1.3 MiB/s | 28.1 KiB | 00m00s [109/304] perl-AutoLoader-0:5.74-520.fc 100% | 1.0 MiB/s | 21.2 KiB | 00m00s [110/304] perl-AutoSplit-0:5.74-520.fc4 100% | 1.0 MiB/s | 21.6 KiB | 00m00s [111/304] perl-Benchmark-0:1.27-520.fc4 100% | 1.2 MiB/s | 26.8 KiB | 00m00s [112/304] perl-CPAN-0:2.38-521.fc43.noa 100% | 4.8 MiB/s | 555.5 KiB | 00m00s [113/304] perl-CPAN-Meta-0:2.150010-520 100% | 4.1 MiB/s | 172.3 KiB | 00m00s [114/304] perl-CPAN-Meta-Requirements-0 100% | 1.2 MiB/s | 34.5 KiB | 00m00s [115/304] perl-CPAN-Meta-YAML-0:0.020-5 100% | 919.5 KiB/s | 26.7 KiB | 00m00s [116/304] perl-Compress-Raw-Bzip2-0:2.2 100% | 412.2 KiB/s | 35.9 KiB | 00m00s [117/304] perl-Compress-Raw-Zlib-0:2.21 100% | 2.3 MiB/s | 65.0 KiB | 00m00s [118/304] perl-Config-Extensions-0:0.03 100% | 437.8 KiB/s | 12.3 KiB | 00m00s [119/304] perl-Config-Perl-V-0:0.38-521 100% | 694.6 KiB/s | 21.5 KiB | 00m00s [120/304] perl-DBM_Filter-0:0.07-520.fc 100% | 66.5 KiB/s | 27.3 KiB | 00m00s [121/304] perl-DB_File-0:1.859-516.fc43 100% | 1.1 MiB/s | 80.8 KiB | 00m00s [122/304] perl-Devel-PPPort-0:3.73-521. 
100% | 2.3 MiB/s | 219.9 KiB | 00m00s [123/304] perl-Devel-Peek-0:1.36-520.fc 100% | 1.5 MiB/s | 31.9 KiB | 00m00s [124/304] perl-Devel-SelfStubber-0:1.06 100% | 682.3 KiB/s | 14.3 KiB | 00m00s [125/304] perl-Digest-SHA-1:6.04-521.fc 100% | 1.5 MiB/s | 61.8 KiB | 00m00s [126/304] perl-DirHandle-0:1.05-520.fc4 100% | 593.1 KiB/s | 12.5 KiB | 00m00s [127/304] perl-Dumpvalue-0:2.27-520.fc4 100% | 763.4 KiB/s | 18.3 KiB | 00m00s [128/304] perl-Encode-devel-4:3.21-520. 100% | 1.8 MiB/s | 41.0 KiB | 00m00s [129/304] perl-English-0:1.11-520.fc43. 100% | 647.6 KiB/s | 13.6 KiB | 00m00s [130/304] perl-Env-0:1.06-520.fc43.noar 100% | 842.4 KiB/s | 19.4 KiB | 00m00s [131/304] perl-ExtUtils-CBuilder-1:0.28 100% | 2.0 MiB/s | 46.5 KiB | 00m00s [132/304] perl-ExtUtils-Command-2:7.76- 100% | 606.7 KiB/s | 14.0 KiB | 00m00s [133/304] perl-ExtUtils-Constant-0:0.25 100% | 1.9 MiB/s | 43.8 KiB | 00m00s [134/304] perl-ExtUtils-Embed-0:1.35-52 100% | 842.1 KiB/s | 17.7 KiB | 00m00s [135/304] perl-ExtUtils-Install-0:2.22- 100% | 1.8 MiB/s | 43.4 KiB | 00m00s [136/304] perl-ExtUtils-MM-Utils-2:7.76 100% | 548.7 KiB/s | 11.5 KiB | 00m00s [137/304] perl-ExtUtils-MakeMaker-2:7.7 100% | 3.0 MiB/s | 284.3 KiB | 00m00s [138/304] perl-File-stat-0:1.14-520.fc4 100% | 3.4 KiB/s | 17.1 KiB | 00m05s [139/304] perl-ExtUtils-Manifest-1:1.75 100% | 1.4 MiB/s | 34.0 KiB | 00m00s [140/304] perl-ExtUtils-Miniperl-0:1.14 100% | 683.2 KiB/s | 15.0 KiB | 00m00s [141/304] perl-File-Compare-0:1.100.800 100% | 662.7 KiB/s | 13.3 KiB | 00m00s [142/304] perl-File-DosGlob-0:1.12-520. 
100% | 932.9 KiB/s | 19.6 KiB | 00m00s [143/304] perl-ExtUtils-ParseXS-1:3.58- 100% | 3.5 MiB/s | 215.2 KiB | 00m00s [144/304] perl-File-Fetch-0:1.08-3.fc43 100% | 1.3 MiB/s | 30.7 KiB | 00m00s [145/304] perl-File-Find-0:1.44-520.fc4 100% | 1.2 MiB/s | 25.3 KiB | 00m00s [146/304] perl-File-Path-0:2.18-520.fc4 100% | 1.7 MiB/s | 35.1 KiB | 00m00s [147/304] perl-File-Temp-1:0.231.100-52 100% | 2.7 MiB/s | 59.0 KiB | 00m00s [148/304] perl-FileCache-0:1.10-520.fc4 100% | 701.2 KiB/s | 14.7 KiB | 00m00s [149/304] perl-Filter-2:1.64-521.fc43.x 100% | 3.1 MiB/s | 79.2 KiB | 00m00s [150/304] perl-Filter-Simple-0:0.96-520 100% | 1.3 MiB/s | 26.9 KiB | 00m00s [151/304] perl-FindBin-0:1.54-520.fc43. 100% | 711.5 KiB/s | 14.2 KiB | 00m00s [152/304] perl-GDBM_File-1:1.24-520.fc4 100% | 2.0 MiB/s | 42.5 KiB | 00m00s [153/304] perl-HTTP-Tiny-0:0.090-521.fc 100% | 2.8 MiB/s | 56.3 KiB | 00m00s [154/304] perl-Hash-Util-0:0.32-520.fc4 100% | 1.6 MiB/s | 34.6 KiB | 00m00s [155/304] perl-Hash-Util-FieldHash-0:1. 100% | 1.7 MiB/s | 38.8 KiB | 00m00s [156/304] perl-I18N-Collate-0:1.02-520. 100% | 675.5 KiB/s | 14.2 KiB | 00m00s [157/304] perl-I18N-LangTags-0:0.45-520 100% | 2.2 MiB/s | 52.6 KiB | 00m00s [158/304] perl-I18N-Langinfo-0:0.24-520 100% | 1.2 MiB/s | 25.6 KiB | 00m00s [159/304] perl-IO-Zlib-1:1.15-520.fc43. 100% | 811.2 KiB/s | 19.5 KiB | 00m00s [160/304] perl-IPC-Cmd-2:1.04-521.fc43. 100% | 1.8 MiB/s | 39.6 KiB | 00m00s [161/304] perl-IO-Compress-0:2.213-521. 100% | 3.7 MiB/s | 294.4 KiB | 00m00s [162/304] perl-IPC-Open3-0:1.24-520.fc4 100% | 1.2 MiB/s | 23.9 KiB | 00m00s [163/304] perl-IPC-SysV-0:2.09-521.fc43 100% | 1.8 MiB/s | 40.6 KiB | 00m00s [164/304] perl-JSON-PP-1:4.16-521.fc43. 
100% | 2.8 MiB/s | 65.5 KiB | 00m00s [165/304] perl-Locale-Maketext-0:1.33-5 100% | 4.1 MiB/s | 93.5 KiB | 00m00s [166/304] perl-Locale-Maketext-Simple-1 100% | 837.4 KiB/s | 17.6 KiB | 00m00s [167/304] perl-Math-BigInt-FastCalc-0:0 100% | 1.3 MiB/s | 28.2 KiB | 00m00s [168/304] perl-Math-Complex-0:1.63-520. 100% | 2.0 MiB/s | 46.2 KiB | 00m00s [169/304] perl-Math-BigInt-1:2.0050.03- 100% | 3.9 MiB/s | 234.5 KiB | 00m00s [170/304] perl-Memoize-0:1.17-520.fc43. 100% | 2.2 MiB/s | 46.6 KiB | 00m00s [171/304] perl-Module-CoreList-tools-1: 100% | 824.3 KiB/s | 19.0 KiB | 00m00s [172/304] perl-Module-CoreList-1:5.2025 100% | 2.2 MiB/s | 93.6 KiB | 00m00s [173/304] perl-Module-Load-1:0.36-520.f 100% | 819.8 KiB/s | 17.2 KiB | 00m00s [174/304] perl-Module-Load-Conditional- 100% | 996.2 KiB/s | 21.9 KiB | 00m00s [175/304] perl-Module-Loaded-1:0.08-520 100% | 669.9 KiB/s | 13.4 KiB | 00m00s [176/304] perl-Module-Metadata-0:1.0000 100% | 1.6 MiB/s | 35.2 KiB | 00m00s [177/304] perl-NDBM_File-0:1.18-520.fc4 100% | 1.1 MiB/s | 22.6 KiB | 00m00s [178/304] perl-NEXT-0:0.69-520.fc43.noa 100% | 997.3 KiB/s | 20.9 KiB | 00m00s [179/304] perl-Net-0:1.04-520.fc43.noar 100% | 1.1 MiB/s | 22.7 KiB | 00m00s [180/304] perl-Net-Ping-0:2.76-520.fc43 100% | 2.2 MiB/s | 49.4 KiB | 00m00s [181/304] perl-ODBM_File-0:1.20-520.fc4 100% | 1.1 MiB/s | 22.8 KiB | 00m00s [182/304] perl-Opcode-0:1.69-520.fc43.x 100% | 1.6 MiB/s | 35.8 KiB | 00m00s [183/304] perl-Params-Check-1:0.38-520. 100% | 1.0 MiB/s | 21.6 KiB | 00m00s [184/304] perl-Perl-OSType-0:1.010-521. 
100% | 988.4 KiB/s | 22.7 KiB | 00m00s [185/304] perl-PerlIO-via-QuotedPrint-0 100% | 1.0 MiB/s | 21.6 KiB | 00m00s [186/304] perl-Pod-Checker-4:1.77-520.f 100% | 1.4 MiB/s | 31.6 KiB | 00m00s [187/304] perl-Pod-Escapes-1:1.07-520.f 100% | 989.0 KiB/s | 19.8 KiB | 00m00s [188/304] perl-Pod-Functions-0:1.14-520 100% | 698.8 KiB/s | 14.7 KiB | 00m00s [189/304] perl-Pod-Html-0:1.35-520.fc43 100% | 1.4 MiB/s | 29.5 KiB | 00m00s [190/304] perl-Pod-Perldoc-0:3.28.01-52 100% | 3.7 MiB/s | 78.7 KiB | 00m00s [191/304] perl-Safe-0:2.47-520.fc43.noa 100% | 1.2 MiB/s | 24.9 KiB | 00m00s [192/304] perl-Pod-Simple-1:3.47-3.fc43 100% | 3.7 MiB/s | 210.5 KiB | 00m00s [193/304] perl-Search-Dict-0:1.08-520.f 100% | 619.6 KiB/s | 13.0 KiB | 00m00s [194/304] perl-SelfLoader-0:1.28-520.fc 100% | 1.0 MiB/s | 21.4 KiB | 00m00s [195/304] perl-Sys-Hostname-0:1.25-520. 100% | 779.9 KiB/s | 17.2 KiB | 00m00s [196/304] perl-Sys-Syslog-0:0.36-521.fc 100% | 2.0 MiB/s | 46.3 KiB | 00m00s [197/304] perl-Term-ANSIColor-0:5.01-52 100% | 2.2 MiB/s | 47.6 KiB | 00m00s [198/304] perl-Term-Cap-0:1.18-520.fc43 100% | 1.1 MiB/s | 21.9 KiB | 00m00s [199/304] perl-Term-Complete-0:1.403-52 100% | 590.9 KiB/s | 13.0 KiB | 00m00s [200/304] perl-Term-ReadLine-0:1.17-520 100% | 952.2 KiB/s | 19.0 KiB | 00m00s [201/304] perl-Term-Table-0:0.025-1.fc4 100% | 1.8 MiB/s | 43.3 KiB | 00m00s [202/304] perl-Test-0:1.31-520.fc43.noa 100% | 1.3 MiB/s | 28.6 KiB | 00m00s [203/304] perl-Test-Harness-1:3.52-4.fc 100% | 4.2 MiB/s | 258.7 KiB | 00m00s [204/304] perl-Text-Abbrev-0:1.02-520.f 100% | 578.5 KiB/s | 12.1 KiB | 00m00s [205/304] perl-Text-Balanced-0:2.07-1.f 100% | 2.1 MiB/s | 48.7 KiB | 00m00s [206/304] perl-Text-Tabs+Wrap-0:2024.00 100% | 1.1 MiB/s | 21.6 KiB | 00m00s [207/304] perl-Test-Simple-3:1.302214-4 100% | 6.1 MiB/s | 800.2 KiB | 00m00s [208/304] perl-Thread-0:3.06-520.fc43.n 100% | 857.7 KiB/s | 18.0 KiB | 00m00s [209/304] perl-Thread-Queue-0:3.14-520. 
100% | 967.5 KiB/s | 21.3 KiB | 00m00s [210/304] perl-Thread-Semaphore-0:2.13- 100% | 745.7 KiB/s | 15.7 KiB | 00m00s [211/304] perl-Tie-0:4.6-520.fc43.noarc 100% | 1.2 MiB/s | 27.8 KiB | 00m00s [212/304] perl-Tie-File-0:1.10-520.fc43 100% | 2.0 MiB/s | 43.2 KiB | 00m00s [213/304] perl-Tie-Memoize-0:1.1-520.fc 100% | 706.1 KiB/s | 14.1 KiB | 00m00s [214/304] perl-Tie-RefHash-0:1.41-520.f 100% | 1.1 MiB/s | 23.5 KiB | 00m00s [215/304] perl-Time-0:1.04-520.fc43.noa 100% | 841.4 KiB/s | 16.8 KiB | 00m00s [216/304] perl-Time-HiRes-4:1.9778-520. 100% | 2.5 MiB/s | 57.1 KiB | 00m00s [217/304] perl-Time-Piece-0:1.3600-520. 100% | 2.0 MiB/s | 40.4 KiB | 00m00s [218/304] perl-Unicode-Normalize-0:1.32 100% | 3.1 MiB/s | 74.0 KiB | 00m00s [219/304] perl-Unicode-UCD-0:0.81-520.f 100% | 3.4 MiB/s | 79.4 KiB | 00m00s [220/304] perl-User-pwent-0:1.05-520.fc 100% | 931.4 KiB/s | 19.6 KiB | 00m00s [221/304] perl-autodie-0:2.37-521.fc43. 100% | 4.1 MiB/s | 92.7 KiB | 00m00s [222/304] perl-autouse-0:1.11-520.fc43. 100% | 656.8 KiB/s | 13.8 KiB | 00m00s [223/304] perl-Unicode-Collate-0:1.31-5 100% | 4.7 MiB/s | 627.1 KiB | 00m00s [224/304] perl-bignum-0:0.67-521.fc43.n 100% | 1.8 MiB/s | 48.9 KiB | 00m00s [225/304] perl-blib-0:1.07-520.fc43.noa 100% | 590.8 KiB/s | 12.4 KiB | 00m00s [226/304] perl-debugger-0:1.60-520.fc43 100% | 5.4 MiB/s | 133.4 KiB | 00m00s [227/304] perl-deprecate-0:0.04-520.fc4 100% | 694.1 KiB/s | 14.6 KiB | 00m00s [228/304] perl-diagnostics-0:1.40-520.f 100% | 3.8 MiB/s | 220.5 KiB | 00m00s [229/304] perl-devel-4:5.42.0-520.fc43. 
100% | 6.8 MiB/s | 648.6 KiB | 00m00s [230/304] perl-encoding-4:3.00-520.fc43 100% | 2.7 MiB/s | 62.9 KiB | 00m00s [231/304] perl-encoding-warnings-0:0.14 100% | 788.3 KiB/s | 16.6 KiB | 00m00s [232/304] perl-experimental-0:0.036-2.f 100% | 1.3 MiB/s | 27.3 KiB | 00m00s [233/304] perl-fields-0:2.27-520.fc43.n 100% | 767.5 KiB/s | 16.1 KiB | 00m00s [234/304] perl-filetest-0:1.03-520.fc43 100% | 695.2 KiB/s | 14.6 KiB | 00m00s [235/304] perl-less-0:0.03-520.fc43.noa 100% | 659.7 KiB/s | 13.2 KiB | 00m00s [236/304] perl-lib-0:0.65-520.fc43.x86_ 100% | 786.9 KiB/s | 15.0 KiB | 00m00s [237/304] perl-libnetcfg-4:5.42.0-520.f 100% | 777.4 KiB/s | 16.3 KiB | 00m00s [238/304] perl-macros-4:5.42.0-520.fc43 100% | 585.5 KiB/s | 12.3 KiB | 00m00s [239/304] perl-meta-notation-0:5.42.0-5 100% | 507.8 KiB/s | 10.7 KiB | 00m00s [240/304] perl-open-0:1.13-520.fc43.noa 100% | 786.5 KiB/s | 16.5 KiB | 00m00s [241/304] perl-perlfaq-0:5.20250619-520 100% | 6.3 MiB/s | 373.9 KiB | 00m00s [242/304] perl-ph-0:5.42.0-520.fc43.x86 100% | 1.9 MiB/s | 45.5 KiB | 00m00s [243/304] perl-podlators-1:6.0.2-520.fc 100% | 5.8 MiB/s | 124.3 KiB | 00m00s [244/304] perl-sigtrap-0:1.10-520.fc43. 100% | 745.7 KiB/s | 15.7 KiB | 00m00s [245/304] perl-sort-0:2.06-520.fc43.noa 100% | 628.4 KiB/s | 13.2 KiB | 00m00s [246/304] perl-subs-0:1.04-520.fc43.noa 100% | 557.8 KiB/s | 11.7 KiB | 00m00s [247/304] perl-threads-1:2.43-520.fc43. 100% | 2.6 MiB/s | 57.9 KiB | 00m00s [248/304] perl-threads-shared-0:1.70-52 100% | 2.0 MiB/s | 44.4 KiB | 00m00s [249/304] perl-utils-0:5.42.0-520.fc43. 
100% | 2.3 MiB/s | 52.5 KiB | 00m00s [250/304] perl-version-9:0.99.33-521.fc 100% | 2.9 MiB/s | 62.8 KiB | 00m00s [251/304] perl-vmsish-0:1.04-520.fc43.n 100% | 670.5 KiB/s | 14.1 KiB | 00m00s [252/304] groff-base-0:1.23.0-10.fc43.x 100% | 11.3 MiB/s | 1.1 MiB | 00m00s [253/304] libpipeline-0:1.5.8-3.fc43.x8 100% | 2.9 MiB/s | 59.9 KiB | 00m00s [254/304] hwdata-0:0.399-1.fc43.noarch 100% | 14.4 MiB/s | 1.7 MiB | 00m00s [255/304] perl-doc-0:5.42.0-520.fc43.no 100% | 6.1 MiB/s | 4.9 MiB | 00m01s [256/304] libxcrypt-devel-0:4.4.38-8.fc 100% | 1.1 MiB/s | 29.2 KiB | 00m00s [257/304] git-core-doc-0:2.51.0-2.fc43. 100% | 18.3 MiB/s | 2.8 MiB | 00m00s [258/304] perl-Git-0:2.51.0-2.fc43.noar 100% | 1.9 MiB/s | 38.1 KiB | 00m00s [259/304] perl-TermReadKey-0:2.38-26.fc 100% | 1.7 MiB/s | 35.2 KiB | 00m00s [260/304] rocm-llvm-devel-0:19-14.rocm6 100% | 4.9 MiB/s | 3.8 MiB | 00m01s [261/304] rocm-llvm-0:19-14.rocm6.4.2.f 100% | 16.5 MiB/s | 13.1 MiB | 00m01s [262/304] git-core-0:2.51.0-2.fc43.x86_ 100% | 2.0 MiB/s | 5.0 MiB | 00m02s [263/304] systemtap-sdt-devel-0:5.3-4.f 100% | 1.5 MiB/s | 68.9 KiB | 00m00s [264/304] zlib-ng-compat-devel-0:2.2.5- 100% | 7.6 KiB/s | 38.3 KiB | 00m05s [265/304] perl-IPC-System-Simple-0:1.30 100% | 1.7 MiB/s | 38.6 KiB | 00m00s [266/304] ncurses-0:6.5-7.20250614.fc43 100% | 7.1 MiB/s | 421.5 KiB | 00m00s [267/304] perl-IO-Socket-SSL-0:2.095-2. 100% | 9.8 MiB/s | 231.5 KiB | 00m00s [268/304] perl-Net-SSLeay-0:1.94-11.fc4 100% | 14.5 MiB/s | 356.8 KiB | 00m00s [269/304] perl-Error-1:0.17030-2.fc43.n 100% | 1.9 MiB/s | 40.2 KiB | 00m00s [270/304] libdb-0:5.3.28-66.fc43.x86_64 100% | 17.3 MiB/s | 778.2 KiB | 00m00s [271/304] perl-Archive-Zip-0:1.68-17.fc 100% | 4.4 MiB/s | 104.8 KiB | 00m00s [272/304] perl-Compress-Bzip2-0:2.28-24 100% | 2.8 MiB/s | 66.6 KiB | 00m00s [273/304] perl-Devel-Size-0:0.85-3.fc43 100% | 1.4 MiB/s | 30.6 KiB | 00m00s [274/304] perl-File-HomeDir-0:1.006-15. 
100% | 2.3 MiB/s | 54.9 KiB | 00m00s [275/304] perl-Module-Build-2:0.42.34-9 100% | 9.5 MiB/s | 242.7 KiB | 00m00s [276/304] perl-Module-Signature-0:0.93- 100% | 3.6 MiB/s | 87.6 KiB | 00m00s [277/304] perl-Text-Glob-0:0.11-26.fc43 100% | 628.3 KiB/s | 13.2 KiB | 00m00s [278/304] perl-local-lib-0:2.000029-10. 100% | 2.9 MiB/s | 66.1 KiB | 00m00s [279/304] perl-IO-Compress-Lzma-0:2.213 100% | 2.9 MiB/s | 71.9 KiB | 00m00s [280/304] perl-Text-Diff-0:1.45-24.fc43 100% | 1.8 MiB/s | 39.9 KiB | 00m00s [281/304] openssh-clients-0:10.0p1-5.fc 100% | 18.1 MiB/s | 742.4 KiB | 00m00s [282/304] python3-pyparsing-0:3.1.2-14. 100% | 10.5 MiB/s | 279.9 KiB | 00m00s [283/304] perl-Algorithm-Diff-0:1.2010- 100% | 2.1 MiB/s | 46.3 KiB | 00m00s [284/304] perl-Software-License-0:0.104 100% | 5.9 MiB/s | 138.2 KiB | 00m00s [285/304] perl-inc-latest-2:0.500-31.fc 100% | 1.0 MiB/s | 23.1 KiB | 00m00s [286/304] perl-Compress-Raw-Lzma-0:2.21 100% | 2.3 MiB/s | 51.2 KiB | 00m00s [287/304] libedit-0:3.1-56.20250104cvs. 100% | 4.7 MiB/s | 105.2 KiB | 00m00s [288/304] libfido2-0:1.16.0-3.fc43.x86_ 100% | 4.4 MiB/s | 98.5 KiB | 00m00s [289/304] openssh-0:10.0p1-5.fc43.x86_6 100% | 13.1 MiB/s | 335.5 KiB | 00m00s [290/304] libcbor-0:0.12.0-6.fc43.x86_6 100% | 1.6 MiB/s | 33.5 KiB | 00m00s [291/304] perl-Data-Section-0:0.200008- 100% | 1.1 MiB/s | 24.9 KiB | 00m00s [292/304] perl-Text-Template-0:1.61-8.f 100% | 2.6 MiB/s | 59.0 KiB | 00m00s [293/304] systemtap-sdt-dtrace-0:5.3-4. 
100% | 13.8 KiB/s | 69.7 KiB | 00m05s [294/304] perl-MRO-Compat-0:0.15-12.fc4 100% | 1.1 MiB/s | 25.2 KiB | 00m00s [295/304] perl-Data-OptList-0:0.114-7.f 100% | 1.2 MiB/s | 26.5 KiB | 00m00s [296/304] perl-Sub-Exporter-0:0.991-6.f 100% | 1.8 MiB/s | 71.1 KiB | 00m00s [297/304] perl-Package-Generator-0:1.10 100% | 1.0 MiB/s | 22.2 KiB | 00m00s [298/304] perl-Params-Util-0:1.102-19.f 100% | 1.4 MiB/s | 32.6 KiB | 00m00s [299/304] perl-Sub-Install-0:0.929-8.fc 100% | 1.0 MiB/s | 22.6 KiB | 00m00s [300/304] gcc-plugin-annobin-0:15.2.1-2 100% | 2.4 MiB/s | 57.1 KiB | 00m00s [301/304] cmake-rpm-macros-0:3.31.6-4.f 100% | 643.8 KiB/s | 14.8 KiB | 00m00s [302/304] annobin-docs-0:12.99-1.fc43.n 100% | 4.4 MiB/s | 89.5 KiB | 00m00s [303/304] annobin-plugin-gcc-0:12.99-1. 100% | 3.3 MiB/s | 996.0 KiB | 00m00s [304/304] rocm-llvm-static-0:19-14.rocm 100% | 8.2 MiB/s | 260.8 MiB | 00m32s -------------------------------------------------------------------------------- [304/304] Total 100% | 12.2 MiB/s | 527.8 MiB | 00m43s Running transaction [ 1/306] Verify package files 100% | 153.0 B/s | 304.0 B | 00m02s [ 2/306] Prepare transaction 100% | 1.3 KiB/s | 304.0 B | 00m00s [ 3/306] Installing cmake-filesystem-0 100% | 2.5 MiB/s | 7.6 KiB | 00m00s [ 4/306] Installing expat-0:2.7.2-1.fc 100% | 18.4 MiB/s | 300.7 KiB | 00m00s [ 5/306] Installing less-0:679-2.fc43. 100% | 25.0 MiB/s | 409.4 KiB | 00m00s [ 6/306] Installing make-1:4.4.1-11.fc 100% | 81.8 MiB/s | 1.8 MiB | 00m00s [ 7/306] Installing libmpc-0:1.3.1-8.f 100% | 11.3 MiB/s | 162.1 KiB | 00m00s [ 8/306] Installing groff-base-0:1.23. 
100% | 75.4 MiB/s | 3.8 MiB | 00m00s [ 9/306] Installing rocm-llvm-filesyst 100% | 2.3 MiB/s | 19.1 KiB | 00m00s [ 10/306] Installing rocm-libc++-0:19-1 100% | 32.4 MiB/s | 1.2 MiB | 00m00s [ 11/306] Installing rocm-llvm-libs-0:1 100% | 54.0 MiB/s | 84.8 MiB | 00m02s [ 12/306] Installing rocm-clang-libs-0: 100% | 68.1 MiB/s | 98.4 MiB | 00m01s [ 13/306] Installing vim-filesystem-2:9 100% | 2.3 MiB/s | 4.7 KiB | 00m00s [ 14/306] Installing emacs-filesystem-1 100% | 0.0 B/s | 544.0 B | 00m00s [ 15/306] Installing gtest-0:1.15.2-4.f 100% | 98.3 MiB/s | 503.2 KiB | 00m00s [ 16/306] Installing kernel-headers-0:6 100% | 104.2 MiB/s | 6.9 MiB | 00m00s [ 17/306] Installing glibc-devel-0:2.42 100% | 84.1 MiB/s | 2.4 MiB | 00m00s [ 18/306] Installing libxcrypt-devel-0: 100% | 32.3 MiB/s | 33.1 KiB | 00m00s [ 19/306] Installing rocm-comgr-0:19-14 100% | 65.4 MiB/s | 123.9 MiB | 00m02s [ 20/306] Installing numactl-libs-0:2.0 100% | 56.4 MiB/s | 57.8 KiB | 00m00s [ 21/306] Installing gmock-0:1.15.2-4.f 100% | 64.7 MiB/s | 132.4 KiB | 00m00s [ 22/306] Installing rocm-lld-0:19-14.r 100% | 61.0 MiB/s | 5.7 MiB | 00m00s [ 23/306] Installing rocm-libc++-devel- 100% | 59.8 MiB/s | 7.7 MiB | 00m00s [ 24/306] Installing cpp-0:15.2.1-2.fc4 100% | 271.1 MiB/s | 38.0 MiB | 00m00s [ 25/306] Installing gcc-0:15.2.1-2.fc4 100% | 296.8 MiB/s | 111.9 MiB | 00m00s [ 26/306] Installing zlib-ng-compat-dev 100% | 106.0 MiB/s | 108.5 KiB | 00m00s [ 27/306] Installing annobin-docs-0:12. 
100% | 97.7 MiB/s | 100.1 KiB | 00m00s [ 28/306] Installing libcbor-0:0.12.0-6 100% | 77.3 MiB/s | 79.2 KiB | 00m00s [ 29/306] Installing libfido2-0:1.16.0- 100% | 117.2 MiB/s | 240.0 KiB | 00m00s [ 30/306] Installing openssh-0:10.0p1-5 100% | 73.3 MiB/s | 1.4 MiB | 00m00s [ 31/306] Installing libedit-0:3.1-56.2 100% | 118.1 MiB/s | 241.8 KiB | 00m00s [ 32/306] Installing openssh-clients-0: 100% | 76.7 MiB/s | 2.6 MiB | 00m00s [ 33/306] Installing git-core-0:2.51.0- 100% | 229.7 MiB/s | 23.7 MiB | 00m00s [ 34/306] Installing git-core-doc-0:2.5 100% | 136.6 MiB/s | 17.9 MiB | 00m00s [ 35/306] Installing libdb-0:5.3.28-66. 100% | 169.3 MiB/s | 1.9 MiB | 00m00s [ 36/306] Installing ncurses-0:6.5-7.20 100% | 26.2 MiB/s | 616.4 KiB | 00m00s [ 37/306] Installing perl-Digest-0:1.20 100% | 18.1 MiB/s | 37.1 KiB | 00m00s [ 38/306] Installing perl-Digest-MD5-0: 100% | 30.1 MiB/s | 61.6 KiB | 00m00s [ 39/306] Installing perl-FileHandle-0: 100% | 9.6 MiB/s | 9.8 KiB | 00m00s [ 40/306] Installing perl-B-0:1.89-520. 100% | 123.2 MiB/s | 504.7 KiB | 00m00s [ 41/306] Installing perl-libnet-0:3.15 100% | 71.9 MiB/s | 294.7 KiB | 00m00s [ 42/306] Installing perl-Data-Dumper-0 100% | 38.3 MiB/s | 117.5 KiB | 00m00s [ 43/306] Installing perl-MIME-Base32-0 100% | 31.4 MiB/s | 32.2 KiB | 00m00s [ 44/306] Installing perl-AutoLoader-0: 100% | 20.5 MiB/s | 21.0 KiB | 00m00s [ 45/306] Installing perl-URI-0:5.34-1. 
100% | 39.3 MiB/s | 281.8 KiB | 00m00s [ 46/306] Installing perl-IO-Socket-IP- 100% | 49.9 MiB/s | 102.2 KiB | 00m00s [ 47/306] Installing perl-Net-SSLeay-0: 100% | 135.9 MiB/s | 1.4 MiB | 00m00s [ 48/306] Installing perl-IO-Socket-SSL 100% | 175.4 MiB/s | 718.6 KiB | 00m00s [ 49/306] Installing perl-Text-Tabs+Wra 100% | 23.3 MiB/s | 23.9 KiB | 00m00s [ 50/306] Installing perl-Pod-Escapes-1 100% | 25.3 MiB/s | 25.9 KiB | 00m00s [ 51/306] Installing perl-File-Path-0:2 100% | 63.0 MiB/s | 64.5 KiB | 00m00s [ 52/306] Installing perl-locale-0:1.13 100% | 6.4 MiB/s | 6.5 KiB | 00m00s [ 53/306] Installing perl-Time-Local-2: 100% | 68.9 MiB/s | 70.6 KiB | 00m00s [ 54/306] Installing perl-if-0:0.61.000 100% | 0.0 B/s | 6.2 KiB | 00m00s [ 55/306] Installing perl-Pod-Simple-1: 100% | 140.3 MiB/s | 574.9 KiB | 00m00s [ 56/306] Installing perl-HTTP-Tiny-0:0 100% | 152.8 MiB/s | 156.4 KiB | 00m00s [ 57/306] Installing perl-Term-Cap-0:1. 100% | 29.9 MiB/s | 30.6 KiB | 00m00s [ 58/306] Installing perl-Term-ANSIColo 100% | 96.9 MiB/s | 99.2 KiB | 00m00s [ 59/306] Installing perl-IPC-Open3-0:1 100% | 27.8 MiB/s | 28.5 KiB | 00m00s [ 60/306] Installing perl-File-Temp-1:0 100% | 160.2 MiB/s | 164.1 KiB | 00m00s [ 61/306] Installing perl-Class-Struct- 100% | 25.3 MiB/s | 25.9 KiB | 00m00s [ 62/306] Installing perl-POSIX-0:2.23- 100% | 113.6 MiB/s | 232.6 KiB | 00m00s [ 63/306] Installing perl-podlators-1:6 100% | 19.6 MiB/s | 321.4 KiB | 00m00s [ 64/306] Installing perl-Pod-Perldoc-0 100% | 9.7 MiB/s | 169.2 KiB | 00m00s [ 65/306] Installing perl-File-stat-0:1 100% | 12.8 MiB/s | 13.1 KiB | 00m00s [ 66/306] Installing perl-SelectSaver-0 100% | 0.0 B/s | 2.6 KiB | 00m00s [ 67/306] Installing perl-Symbol-0:1.09 100% | 0.0 B/s | 7.3 KiB | 00m00s [ 68/306] Installing perl-Socket-4:2.04 100% | 59.7 MiB/s | 122.3 KiB | 00m00s [ 69/306] Installing perl-Pod-Usage-4:2 100% | 6.1 MiB/s | 87.9 KiB | 00m00s [ 70/306] Installing perl-IO-0:1.55-520 100% | 74.0 MiB/s | 151.7 KiB | 00m00s [ 71/306] 
Installing perl-Text-ParseWor 100% | 14.2 MiB/s | 14.6 KiB | 00m00s [ 72/306] Installing perl-Fcntl-0:1.20- 100% | 48.7 MiB/s | 49.9 KiB | 00m00s [ 73/306] Installing perl-mro-0:1.29-52 100% | 41.7 MiB/s | 42.7 KiB | 00m00s [ 74/306] Installing perl-overloading-0 100% | 0.0 B/s | 5.6 KiB | 00m00s [ 75/306] Installing perl-base-0:2.27-5 100% | 0.0 B/s | 13.0 KiB | 00m00s [ 76/306] Installing perl-File-Basename 100% | 0.0 B/s | 14.6 KiB | 00m00s [ 77/306] Installing perl-Getopt-Long-1 100% | 71.9 MiB/s | 147.2 KiB | 00m00s [ 78/306] Installing perl-Storable-1:3. 100% | 113.7 MiB/s | 232.8 KiB | 00m00s [ 79/306] Installing perl-overload-0:1. 100% | 70.3 MiB/s | 72.0 KiB | 00m00s [ 80/306] Installing perl-vars-0:1.05-5 100% | 0.0 B/s | 4.3 KiB | 00m00s [ 81/306] Installing perl-Errno-0:1.38- 100% | 8.6 MiB/s | 8.8 KiB | 00m00s [ 82/306] Installing perl-parent-1:0.24 100% | 10.7 MiB/s | 11.0 KiB | 00m00s [ 83/306] Installing perl-constant-0:1. 100% | 26.7 MiB/s | 27.4 KiB | 00m00s [ 84/306] Installing perl-MIME-Base64-0 100% | 21.6 MiB/s | 44.3 KiB | 00m00s [ 85/306] Installing perl-Scalar-List-U 100% | 48.4 MiB/s | 148.7 KiB | 00m00s [ 86/306] Installing perl-Getopt-Std-0: 100% | 11.5 MiB/s | 11.8 KiB | 00m00s [ 87/306] Installing perl-Encode-4:3.21 100% | 151.4 MiB/s | 4.7 MiB | 00m00s [ 88/306] Installing perl-DynaLoader-0: 100% | 0.0 B/s | 32.5 KiB | 00m00s [ 89/306] Installing perl-PathTools-0:3 100% | 60.1 MiB/s | 184.6 KiB | 00m00s [ 90/306] Installing perl-Exporter-0:5. 
100% | 54.3 MiB/s | 55.6 KiB | 00m00s [ 91/306] Installing perl-Carp-0:1.54-5 100% | 23.3 MiB/s | 47.7 KiB | 00m00s [ 92/306] Installing perl-libs-4:5.42.0 100% | 159.6 MiB/s | 11.6 MiB | 00m00s [ 93/306] Installing perl-interpreter-4 100% | 7.8 MiB/s | 120.3 KiB | 00m00s [ 94/306] Installing perl-File-Find-0:1 100% | 41.5 MiB/s | 42.5 KiB | 00m00s [ 95/306] Installing perl-version-9:0.9 100% | 64.2 MiB/s | 131.5 KiB | 00m00s [ 96/306] Installing perl-File-Copy-0:2 100% | 0.0 B/s | 20.2 KiB | 00m00s [ 97/306] Installing perl-ExtUtils-Mani 100% | 84.3 MiB/s | 86.3 KiB | 00m00s [ 98/306] Installing perl-lib-0:0.65-52 100% | 0.0 B/s | 8.9 KiB | 00m00s [ 99/306] Installing perl-threads-1:2.4 100% | 57.2 MiB/s | 117.1 KiB | 00m00s [100/306] Installing perl-threads-share 100% | 41.9 MiB/s | 85.9 KiB | 00m00s [101/306] Installing perl-Compress-Raw- 100% | 80.8 MiB/s | 165.5 KiB | 00m00s [102/306] Installing perl-File-Compare- 100% | 0.0 B/s | 6.2 KiB | 00m00s [103/306] Installing perl-Time-HiRes-4: 100% | 57.5 MiB/s | 117.8 KiB | 00m00s [104/306] Installing perl-CPAN-Meta-Req 100% | 81.5 MiB/s | 83.4 KiB | 00m00s [105/306] Installing perl-Module-CoreLi 100% | 310.1 MiB/s | 1.2 MiB | 00m00s [106/306] Installing perl-Module-Metada 100% | 67.4 MiB/s | 69.0 KiB | 00m00s [107/306] Installing perl-Digest-SHA-1: 100% | 7.0 MiB/s | 115.0 KiB | 00m00s [108/306] Installing perl-Filter-2:1.64 100% | 32.5 MiB/s | 166.2 KiB | 00m00s [109/306] Installing perl-Module-Load-1 100% | 15.5 MiB/s | 15.9 KiB | 00m00s [110/306] Installing perl-Perl-OSType-0 100% | 33.5 MiB/s | 34.3 KiB | 00m00s [111/306] Installing perl-Term-ReadLine 100% | 17.4 MiB/s | 17.9 KiB | 00m00s [112/306] Installing perl-Tie-0:4.6-520 100% | 33.1 MiB/s | 33.9 KiB | 00m00s [113/306] Installing perl-Unicode-Norma 100% | 159.0 MiB/s | 488.4 KiB | 00m00s [114/306] Installing perl-meta-notation 100% | 0.0 B/s | 2.3 KiB | 00m00s [115/306] Installing perl-encoding-4:3. 
100% | 146.9 MiB/s | 150.4 KiB | 00m00s [116/306] Installing perl-Net-Ping-0:2. 100% | 132.2 MiB/s | 135.3 KiB | 00m00s [117/306] Installing perl-ExtUtils-Comm 100% | 0.0 B/s | 10.2 KiB | 00m00s [118/306] Installing perl-Pod-Html-0:1. 100% | 3.1 MiB/s | 43.9 KiB | 00m00s [119/306] Installing perl-File-Which-0: 100% | 30.7 MiB/s | 31.4 KiB | 00m00s [120/306] Installing perl-AutoSplit-0:5 100% | 0.0 B/s | 23.6 KiB | 00m00s [121/306] Installing perl-Benchmark-0:1 100% | 36.0 MiB/s | 36.8 KiB | 00m00s [122/306] Installing perl-Test-Harness- 100% | 24.8 MiB/s | 583.6 KiB | 00m00s [123/306] Installing perl-CPAN-Meta-YAM 100% | 52.3 MiB/s | 53.6 KiB | 00m00s [124/306] Installing perl-Compress-Raw- 100% | 34.0 MiB/s | 69.6 KiB | 00m00s [125/306] Installing perl-IO-Compress-0 100% | 51.6 MiB/s | 1.0 MiB | 00m00s [126/306] Installing perl-IO-Zlib-1:1.1 100% | 26.1 MiB/s | 26.7 KiB | 00m00s [127/306] Installing perl-Devel-PPPort- 100% | 217.8 MiB/s | 892.1 KiB | 00m00s [128/306] Installing perl-DirHandle-0:1 100% | 0.0 B/s | 3.8 KiB | 00m00s [129/306] Installing perl-Dumpvalue-0:2 100% | 0.0 B/s | 20.2 KiB | 00m00s [130/306] Installing perl-ExtUtils-Cons 100% | 85.7 MiB/s | 87.7 KiB | 00m00s [131/306] Installing perl-ExtUtils-MM-U 100% | 3.6 MiB/s | 3.7 KiB | 00m00s [132/306] Installing perl-Hash-Util-Fie 100% | 31.4 MiB/s | 64.3 KiB | 00m00s [133/306] Installing perl-Hash-Util-0:0 100% | 55.1 MiB/s | 56.4 KiB | 00m00s [134/306] Installing perl-fields-0:2.27 100% | 0.0 B/s | 12.3 KiB | 00m00s [135/306] Installing perl-ExtUtils-Pars 100% | 31.3 MiB/s | 545.5 KiB | 00m00s [136/306] Installing perl-ExtUtils-Make 100% | 40.7 MiB/s | 750.3 KiB | 00m00s [137/306] Installing perl-ExtUtils-Inst 100% | 85.1 MiB/s | 87.2 KiB | 00m00s [138/306] Installing perl-I18N-LangTags 100% | 81.8 MiB/s | 83.8 KiB | 00m00s [139/306] Installing perl-Locale-Makete 100% | 84.9 MiB/s | 173.9 KiB | 00m00s [140/306] Installing perl-Locale-Makete 100% | 13.2 MiB/s | 13.5 KiB | 00m00s [141/306] Installing 
perl-Params-Check- 100% | 27.9 MiB/s | 28.6 KiB | 00m00s [142/306] Installing perl-Module-Load-C 100% | 29.2 MiB/s | 29.9 KiB | 00m00s [143/306] Installing perl-IPC-Cmd-2:1.0 100% | 83.9 MiB/s | 85.9 KiB | 00m00s [144/306] Installing perl-Math-Complex- 100% | 84.0 MiB/s | 86.0 KiB | 00m00s [145/306] Installing perl-Math-BigInt-1 100% | 212.8 MiB/s | 1.1 MiB | 00m00s [146/306] Installing perl-JSON-PP-1:4.1 100% | 10.0 MiB/s | 143.6 KiB | 00m00s [147/306] Installing perl-CPAN-Meta-0:2 100% | 66.6 MiB/s | 613.8 KiB | 00m00s [148/306] Installing perl-NDBM_File-0:1 100% | 28.9 MiB/s | 29.6 KiB | 00m00s [149/306] Installing perl-SelfLoader-0: 100% | 0.0 B/s | 22.6 KiB | 00m00s [150/306] Installing perl-Sys-Hostname- 100% | 16.8 MiB/s | 17.2 KiB | 00m00s [151/306] Installing perl-Term-Table-0: 100% | 39.6 MiB/s | 81.2 KiB | 00m00s [152/306] Installing perl-Text-Balanced 100% | 110.2 MiB/s | 112.8 KiB | 00m00s [153/306] Installing perl-Tie-RefHash-0 100% | 36.5 MiB/s | 37.4 KiB | 00m00s [154/306] Installing perl-User-pwent-0: 100% | 17.5 MiB/s | 17.9 KiB | 00m00s [155/306] Installing perl-autouse-0:1.1 100% | 6.2 MiB/s | 6.4 KiB | 00m00s [156/306] Installing perl-subs-0:1.04-5 100% | 0.0 B/s | 2.5 KiB | 00m00s [157/306] Installing perl-Opcode-0:1.69 100% | 48.8 MiB/s | 50.0 KiB | 00m00s [158/306] Installing perl-Safe-0:2.47-5 100% | 0.0 B/s | 31.1 KiB | 00m00s [159/306] Installing perl-Params-Util-0 100% | 29.8 MiB/s | 61.0 KiB | 00m00s [160/306] Installing perl-Sub-Install-0 100% | 36.3 MiB/s | 37.2 KiB | 00m00s [161/306] Installing perl-Data-OptList- 100% | 51.0 MiB/s | 52.2 KiB | 00m00s [162/306] Installing perl-Filter-Simple 100% | 50.5 MiB/s | 51.7 KiB | 00m00s [163/306] Installing perl-Test-Simple-3 100% | 70.8 MiB/s | 1.8 MiB | 00m00s [164/306] Installing perl-Devel-SelfStu 100% | 7.2 MiB/s | 7.3 KiB | 00m00s [165/306] Installing perl-Memoize-0:1.1 100% | 65.2 MiB/s | 66.8 KiB | 00m00s [166/306] Installing perl-Math-BigInt-F 100% | 22.9 MiB/s | 46.9 KiB | 00m00s 
[167/306] Installing perl-bignum-0:0.67 100% | 66.6 MiB/s | 136.5 KiB | 00m00s [168/306] Installing perl-File-Fetch-0: 100% | 59.9 MiB/s | 61.3 KiB | 00m00s [169/306] Installing perl-inc-latest-2: 100% | 35.5 MiB/s | 36.3 KiB | 00m00s [170/306] Installing perl-libnetcfg-4:5 100% | 1.3 MiB/s | 17.3 KiB | 00m00s [171/306] Installing perl-DBM_Filter-0: 100% | 30.0 MiB/s | 30.7 KiB | 00m00s [172/306] Installing perl-File-HomeDir- 100% | 60.5 MiB/s | 123.8 KiB | 00m00s [173/306] Installing perl-open-0:1.13-5 100% | 0.0 B/s | 11.7 KiB | 00m00s [174/306] Installing perl-debugger-0:1. 100% | 197.4 MiB/s | 404.3 KiB | 00m00s [175/306] Installing perl-sigtrap-0:1.1 100% | 11.2 MiB/s | 11.5 KiB | 00m00s [176/306] Installing perl-Unicode-Colla 100% | 246.8 MiB/s | 4.2 MiB | 00m00s [177/306] Installing perl-Unicode-UCD-0 100% | 202.1 MiB/s | 206.9 KiB | 00m00s [178/306] Installing perl-Env-0:1.06-52 100% | 26.6 MiB/s | 27.2 KiB | 00m00s [179/306] Installing perl-Module-CoreLi 100% | 1.3 MiB/s | 19.3 KiB | 00m00s [180/306] Installing perl-Archive-Zip-0 100% | 18.2 MiB/s | 297.8 KiB | 00m00s [181/306] Installing perl-Thread-0:3.06 100% | 0.0 B/s | 12.5 KiB | 00m00s [182/306] Installing perl-Thread-Queue- 100% | 4.9 MiB/s | 30.4 KiB | 00m00s [183/306] Installing perl-Thread-Semaph 100% | 0.0 B/s | 10.6 KiB | 00m00s [184/306] Installing perl-experimental- 100% | 43.8 MiB/s | 44.8 KiB | 00m00s [185/306] Installing perl-Encode-devel- 100% | 7.6 MiB/s | 101.1 KiB | 00m00s [186/306] Installing perl-Pod-Checker-4 100% | 4.0 MiB/s | 53.5 KiB | 00m00s [187/306] Installing perl-diagnostics-0 100% | 32.9 MiB/s | 472.1 KiB | 00m00s [188/306] Installing perl-macros-4:5.42 100% | 0.0 B/s | 5.8 KiB | 00m00s [189/306] Installing perl-utils-0:5.42. 
100% | 7.4 MiB/s | 98.6 KiB | 00m00s [190/306] Installing perl-Attribute-Han 100% | 0.0 B/s | 40.5 KiB | 00m00s [191/306] Installing perl-Config-Extens 100% | 0.0 B/s | 3.2 KiB | 00m00s [192/306] Installing perl-Config-Perl-V 100% | 26.9 MiB/s | 27.5 KiB | 00m00s [193/306] Installing perl-DB_File-0:1.8 100% | 93.1 MiB/s | 190.6 KiB | 00m00s [194/306] Installing perl-Devel-Peek-0: 100% | 43.8 MiB/s | 44.9 KiB | 00m00s [195/306] Installing perl-English-0:1.1 100% | 0.0 B/s | 6.7 KiB | 00m00s [196/306] Installing perl-File-DosGlob- 100% | 21.7 MiB/s | 22.2 KiB | 00m00s [197/306] Installing perl-FileCache-0:1 100% | 0.0 B/s | 7.9 KiB | 00m00s [198/306] Installing perl-FindBin-0:1.5 100% | 0.0 B/s | 7.2 KiB | 00m00s [199/306] Installing perl-GDBM_File-1:1 100% | 78.9 MiB/s | 80.8 KiB | 00m00s [200/306] Installing perl-I18N-Collate- 100% | 0.0 B/s | 7.6 KiB | 00m00s [201/306] Installing perl-I18N-Langinfo 100% | 35.3 MiB/s | 36.2 KiB | 00m00s [202/306] Installing perl-IPC-SysV-0:2. 100% | 37.4 MiB/s | 76.7 KiB | 00m00s [203/306] Installing perl-Module-Loaded 100% | 0.0 B/s | 5.6 KiB | 00m00s [204/306] Installing perl-NEXT-0:0.69-5 100% | 0.0 B/s | 24.0 KiB | 00m00s [205/306] Installing perl-Net-0:1.04-52 100% | 23.3 MiB/s | 23.9 KiB | 00m00s [206/306] Installing perl-ODBM_File-0:1 100% | 29.0 MiB/s | 29.7 KiB | 00m00s [207/306] Installing perl-PerlIO-via-Qu 100% | 31.4 MiB/s | 32.1 KiB | 00m00s [208/306] Installing perl-Pod-Functions 100% | 0.0 B/s | 14.8 KiB | 00m00s [209/306] Installing perl-Search-Dict-0 100% | 0.0 B/s | 5.2 KiB | 00m00s [210/306] Installing perl-Sys-Syslog-0: 100% | 47.3 MiB/s | 96.9 KiB | 00m00s [211/306] Installing perl-Term-Complete 100% | 0.0 B/s | 6.3 KiB | 00m00s [212/306] Installing perl-Test-0:1.31-5 100% | 0.0 B/s | 37.4 KiB | 00m00s [213/306] Installing perl-Text-Abbrev-0 100% | 0.0 B/s | 3.6 KiB | 00m00s [214/306] Installing perl-Tie-File-0:1. 
100% | 84.1 MiB/s | 86.2 KiB | 00m00s [215/306] Installing perl-Tie-Memoize-0 100% | 0.0 B/s | 6.8 KiB | 00m00s [216/306] Installing perl-Time-0:1.04-5 100% | 10.7 MiB/s | 10.9 KiB | 00m00s [217/306] Installing perl-Time-Piece-0: 100% | 71.2 MiB/s | 72.9 KiB | 00m00s [218/306] Installing perl-blib-0:1.07-5 100% | 0.0 B/s | 3.6 KiB | 00m00s [219/306] Installing perl-deprecate-0:0 100% | 6.8 MiB/s | 7.0 KiB | 00m00s [220/306] Installing perl-doc-0:5.42.0- 100% | 255.8 MiB/s | 11.5 MiB | 00m00s [221/306] Installing perl-encoding-warn 100% | 0.0 B/s | 10.7 KiB | 00m00s [222/306] Installing perl-filetest-0:1. 100% | 0.0 B/s | 6.8 KiB | 00m00s [223/306] Installing perl-less-0:0.03-5 100% | 0.0 B/s | 5.3 KiB | 00m00s [224/306] Installing perl-perlfaq-0:5.2 100% | 240.2 MiB/s | 737.9 KiB | 00m00s [225/306] Installing perl-ph-0:5.42.0-5 100% | 91.7 MiB/s | 281.8 KiB | 00m00s [226/306] Installing perl-sort-0:2.06-5 100% | 0.0 B/s | 5.2 KiB | 00m00s [227/306] Installing perl-vmsish-0:1.04 100% | 0.0 B/s | 7.0 KiB | 00m00s [228/306] Installing perl-TermReadKey-0 100% | 32.3 MiB/s | 66.2 KiB | 00m00s [229/306] Installing perl-IPC-System-Si 100% | 71.8 MiB/s | 73.5 KiB | 00m00s [230/306] Installing perl-autodie-0:2.3 100% | 71.3 MiB/s | 219.1 KiB | 00m00s [231/306] Installing perl-Error-1:0.170 100% | 39.0 MiB/s | 80.0 KiB | 00m00s [232/306] Installing git-0:2.51.0-2.fc4 100% | 56.4 MiB/s | 57.7 KiB | 00m00s [233/306] Installing perl-Git-0:2.51.0- 100% | 63.8 MiB/s | 65.4 KiB | 00m00s [234/306] Installing perl-Compress-Bzip 100% | 70.9 MiB/s | 145.3 KiB | 00m00s [235/306] Installing perl-Devel-Size-0: 100% | 42.8 MiB/s | 43.8 KiB | 00m00s [236/306] Installing perl-Text-Glob-0:0 100% | 9.1 MiB/s | 9.3 KiB | 00m00s [237/306] Installing perl-local-lib-0:2 100% | 58.8 MiB/s | 120.4 KiB | 00m00s [238/306] Installing perl-Algorithm-Dif 100% | 106.9 MiB/s | 109.5 KiB | 00m00s [239/306] Installing perl-Text-Diff-0:1 100% | 83.1 MiB/s | 85.1 KiB | 00m00s [240/306] Installing 
perl-Module-Signat 100% | 9.7 MiB/s | 138.8 KiB | 00m00s [241/306] Installing perl-Compress-Raw- 100% | 60.2 MiB/s | 123.3 KiB | 00m00s [242/306] Installing perl-IO-Compress-L 100% | 107.6 MiB/s | 220.4 KiB | 00m00s [243/306] Installing perl-Archive-Tar-0 100% | 10.9 MiB/s | 156.9 KiB | 00m00s [244/306] Installing perl-Text-Template 100% | 111.3 MiB/s | 114.0 KiB | 00m00s [245/306] Installing perl-MRO-Compat-0: 100% | 43.8 MiB/s | 44.9 KiB | 00m00s [246/306] Installing perl-Package-Gener 100% | 30.8 MiB/s | 31.5 KiB | 00m00s [247/306] Installing perl-Sub-Exporter- 100% | 65.7 MiB/s | 201.9 KiB | 00m00s [248/306] Installing perl-Data-Section- 100% | 43.0 MiB/s | 44.1 KiB | 00m00s [249/306] Installing perl-Software-Lice 100% | 100.2 MiB/s | 513.2 KiB | 00m00s [250/306] Installing perl-Module-Build- 100% | 38.1 MiB/s | 663.2 KiB | 00m00s [251/306] Installing systemtap-sdt-deve 100% | 180.0 MiB/s | 184.3 KiB | 00m00s [252/306] Installing hwdata-0:0.399-1.f 100% | 417.2 MiB/s | 9.6 MiB | 00m00s [253/306] Installing libpciaccess-0:0.1 100% | 44.8 MiB/s | 45.9 KiB | 00m00s [254/306] Installing libdrm-0:2.4.125-2 100% | 130.1 MiB/s | 399.7 KiB | 00m00s [255/306] Installing rocm-runtime-0:6.4 100% | 341.7 MiB/s | 3.1 MiB | 00m00s [256/306] Installing rocm-runtime-devel 100% | 280.7 MiB/s | 574.9 KiB | 00m00s [257/306] Installing libpciaccess-devel 100% | 0.0 B/s | 15.9 KiB | 00m00s [258/306] Installing libdrm-devel-0:2.4 100% | 144.1 MiB/s | 737.9 KiB | 00m00s [259/306] Installing libpipeline-0:1.5. 100% | 6.8 MiB/s | 146.6 KiB | 00m00s [260/306] Installing man-db-0:2.13.1-2. 
100% | 47.8 MiB/s | 2.9 MiB | 00m00s [261/306] Installing tzdata-0:2025b-3.f 100% | 25.2 MiB/s | 1.9 MiB | 00m00s [262/306] Installing python-pip-wheel-0 100% | 415.1 MiB/s | 1.2 MiB | 00m00s [263/306] Installing mpdecimal-0:4.0.1- 100% | 35.6 MiB/s | 218.8 KiB | 00m00s [264/306] Installing python3-libs-0:3.1 100% | 199.8 MiB/s | 43.3 MiB | 00m00s [265/306] Installing python3-0:3.14.0~r 100% | 2.1 MiB/s | 30.7 KiB | 00m00s [266/306] Installing cmake-rpm-macros-0 100% | 8.1 MiB/s | 8.3 KiB | 00m00s [267/306] Installing rocm-smi-0:6.4.3-1 100% | 120.7 MiB/s | 2.7 MiB | 00m00s [268/306] Installing rocm-llvm-0:19-14. 100% | 53.2 MiB/s | 48.5 MiB | 00m01s [269/306] Installing rocm-llvm-devel-0: 100% | 67.8 MiB/s | 25.7 MiB | 00m00s [270/306] Installing rocm-llvm-static-0 100% | 82.5 MiB/s | 1.8 GiB | 00m22s [271/306] Installing python3-pyparsing- 100% | 171.6 MiB/s | 1.0 MiB | 00m00s [272/306] Installing systemtap-sdt-dtra 100% | 11.0 MiB/s | 180.9 KiB | 00m00s [273/306] Installing perl-devel-4:5.42. 
100% | 141.1 MiB/s | 3.8 MiB | 00m00s [274/306] Installing perl-ExtUtils-Embe 100% | 0.0 B/s | 16.1 KiB | 00m00s [275/306] Installing perl-ExtUtils-Mini 100% | 8.6 MiB/s | 8.8 KiB | 00m00s [276/306] Installing rocm-clang-runtime 100% | 104.8 MiB/s | 7.9 MiB | 00m00s [277/306] Installing rocm-clang-0:19-14 100% | 69.6 MiB/s | 70.2 MiB | 00m01s [278/306] Installing rocm-clang-devel-0 100% | 93.1 MiB/s | 23.5 MiB | 00m00s [279/306] Installing rocm-device-libs-0 100% | 78.4 MiB/s | 3.2 MiB | 00m00s [280/306] Installing rocm-comgr-devel-0 100% | 48.6 MiB/s | 99.6 KiB | 00m00s [281/306] Installing hipcc-0:19-14.rocm 100% | 29.0 MiB/s | 654.3 KiB | 00m00s [282/306] Installing rocm-hip-0:6.4.2-2 100% | 268.2 MiB/s | 24.9 MiB | 00m00s [283/306] Installing libtommath-0:1.3.1 100% | 62.3 MiB/s | 127.5 KiB | 00m00s [284/306] Installing tcl-1:9.0.2-1.fc43 100% | 103.3 MiB/s | 4.3 MiB | 00m00s [285/306] Installing rhash-0:1.4.5-3.fc 100% | 21.8 MiB/s | 356.4 KiB | 00m00s [286/306] Installing libuv-1:1.51.0-2.f 100% | 186.5 MiB/s | 573.0 KiB | 00m00s [287/306] Installing jsoncpp-0:1.9.6-2. 100% | 126.5 MiB/s | 259.2 KiB | 00m00s [288/306] Installing cmake-0:3.31.6-4.f 100% | 253.7 MiB/s | 34.5 MiB | 00m00s [289/306] Installing cmake-data-0:3.31. 100% | 54.6 MiB/s | 9.1 MiB | 00m00s [290/306] Installing procps-ng-0:4.0.4- 100% | 48.1 MiB/s | 1.0 MiB | 00m00s [291/306] Installing environment-module 100% | 40.1 MiB/s | 1.9 MiB | 00m00s [292/306] Installing libstdc++-devel-0: 100% | 295.2 MiB/s | 37.5 MiB | 00m00s [293/306] Installing gcc-c++-0:15.2.1-2 100% | 277.6 MiB/s | 41.4 MiB | 00m00s [294/306] Installing perl-ExtUtils-CBui 100% | 33.2 MiB/s | 102.1 KiB | 00m00s [295/306] Installing perl-CPAN-0:2.38-5 100% | 82.4 MiB/s | 1.9 MiB | 00m00s [296/306] Installing perl-4:5.42.0-520. 
100% | 0.0 B/s | 124.0 B | 00m00s [297/306] Installing rocm-core-0:6.4.4- 100% | 13.2 MiB/s | 13.5 KiB | 00m00s [298/306] Installing rocm-core-devel-0: 100% | 15.8 MiB/s | 16.1 KiB | 00m00s [299/306] Installing hipify-0:6.4.1-3.f 100% | 140.3 MiB/s | 3.1 MiB | 00m00s [300/306] Installing rocm-rpm-macros-0: 100% | 19.0 MiB/s | 19.5 KiB | 00m00s [301/306] Installing rocm-cmake-0:6.4.0 100% | 66.2 MiB/s | 135.6 KiB | 00m00s [302/306] Installing rocm-hip-devel-0:6 100% | 120.5 MiB/s | 2.8 MiB | 00m00s [303/306] Installing rocm-smi-devel-0:6 100% | 138.7 MiB/s | 284.0 KiB | 00m00s [304/306] Installing annobin-plugin-gcc 100% | 39.5 MiB/s | 1.0 MiB | 00m00s [305/306] Installing gcc-plugin-annobin 100% | 2.1 MiB/s | 58.6 KiB | 00m00s [306/306] Installing gtest-devel-0:1.15 100% | 4.3 MiB/s | 1.1 MiB | 00m00s Warning: skipped OpenPGP checks for 304 packages from repository: https_kojipkgs_fedoraproject_org_repos_f43_build_side_119953_6609762_x86_64 Complete! Finish: build setup for rccl-6.4.2-5.fc43.src.rpm Start: rpmbuild rccl-6.4.2-5.fc43.src.rpm Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1755475200 Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.0QIPYI Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.ifz2PT + umask 022 + cd /builddir/build/BUILD/rccl-6.4.2-build + cd /builddir/build/BUILD/rccl-6.4.2-build + rm -rf rccl-rocm-6.4.2 + /usr/lib/rpm/rpmuncompress -x /builddir/build/SOURCES/RCCL-6.4.2.tar.gz + STATUS=0 + '[' 0 -ne 0 ']' + cd rccl-rocm-6.4.2 + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w . 
+ sed -i -e '/AMD GPU targets to compile for/d' CMakeLists.txt + sed -i -e 's@cat ${ROCM_PATH}/.info/version@echo 6.4.2@' CMakeLists.txt + sed -i -e s@rocm-core/rocm_version.h@rocm_version.h@ src/include/hip_rocm_version_info.h + sed -i -e 's@if (ENABLE_MSCCLPP AND NOT(${HOST_OS_ID} STREQUAL "ubuntu" OR ${HOST_OS_ID} STREQUAL "centos"))@if (ENABLE_MSCCLPP)@' CMakeLists.txt + sed -i '/#include ' test/common/TestBed.hpp + sed -i -e 's@set(CMAKE_CXX_STANDARD 14)@set(CMAKE_CXX_STANDARD 17)@' CMakeLists.txt ++ cat /proc/cpuinfo ++ grep -m 1 'cpu cores' ++ awk '{ print $4 }' + COMPILE_JOBS=1 + '[' 1x = x ']' + '[' 1 = 1 ']' ++ lscpu ++ grep '^CPU(s)' ++ awk '{ print $2 }' + COMPILE_JOBS=2 + '[' 2x = x ']' + BUILD_MEM=16 + MEM_KB=0 ++ cat /proc/meminfo ++ grep MemTotal ++ awk '{ print $2 }' + MEM_KB=16364836 ++ eval 'expr 16364836 / 1024' +++ expr 16364836 / 1024 + MEM_MB=15981 ++ eval 'expr 15981 / 1024' +++ expr 15981 / 1024 + MEM_GB=15 ++ eval 'expr 1 + 15 / 16' +++ expr 1 + 15 / 16 + COMPILE_JOBS_MEM=1 + '[' 1 -lt 2 ']' + COMPILE_JOBS=1 + LINK_MEM=24 ++ eval 'expr 1 + 15 / 24' +++ expr 1 + 15 / 24 + LINK_JOBS=1 + sed -i -e 's@rccl PRIVATE -parallel-jobs=12@rccl PRIVATE -parallel-jobs=1@' CMakeLists.txt + sed -i -e 's@-parallel-jobs=${num_linker_jobs}@-parallel-jobs=1@' CMakeLists.txt + RPM_EC=0 ++ jobs -p + exit 0 Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.crs9L9 + umask 022 + cd /builddir/build/BUILD/rccl-6.4.2-build + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security 
-Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + cd rccl-rocm-6.4.2 + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong 
-m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + 
CXX=hipcc + export CXX + /usr/bin/cmake -S . -B redhat-linux-build -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON '-DAMDGPU_TARGETS=gfx90a:xnack+;gfx90a:xnack-;gfx1100;gfx1101;gfx1102;gfx1200;gfx1201' -DBUILD_FILE_REORG_BACKWARD_COMPATIBILITY=OFF -DBUILD_TESTS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_C_COMPILER=/usr/bin/hipcc -DCMAKE_CXX_COMPILER=/usr/bin/hipcc -DCMAKE_EXPORT_COMPILE_COMMANDS=OFF -DCMAKE_INSTALL_LIBDIR=/usr/lib64 -DCMAKE_SKIP_RPATH=ON -DENABLE_MSCCLPP=OFF -DHIP_PLATFORM=amd -DRCCL_ROCPROFILER_REGISTER=OFF -DROCM_PATH=/usr -DROCM_SYMLINK_LIBS=OFF CMake Deprecation Warning at CMakeLists.txt:6 (cmake_minimum_required): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument value. Or, use the ... syntax to tell CMake that the project requires at least but has been updated to work with policies introduced by or earlier. -- CMAKE_TOOLCHAIN_FILE: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/toolchain-linux.cmake -- The CXX compiler identification is Clang 19.0.0 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/hipcc - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Found GTest: /usr/lib64/cmake/GTest/GTestConfig.cmake (found suitable version "1.15.2", minimum required is "1.11") CMake Deprecation Warning at /usr/share/rocm/cmake/ROCMConfig.cmake:12 (message): Use of find_package(ROCM) is deprecated as of ROCm 6.4. 
Please use find_package(ROCmCMakeBuildTools) Call Stack (most recent call first): cmake/Dependencies.cmake:75 (find_package) CMakeLists.txt:55 (include) -- Checking for ROCm support for GPU targets: gfx906;gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1102;gfx1200;gfx1201 -- Performing Test COMPILER_HAS_TARGET_ID_gfx906 -- Performing Test COMPILER_HAS_TARGET_ID_gfx906 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx908 -- Performing Test COMPILER_HAS_TARGET_ID_gfx908 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx90a -- Performing Test COMPILER_HAS_TARGET_ID_gfx90a - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx942 -- Performing Test COMPILER_HAS_TARGET_ID_gfx942 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1101 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1101 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1200 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1200 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1201 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1201 - Success -- Compiling for gfx906;gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1102;gfx1200;gfx1201 CMake Deprecation Warning at /usr/share/rocm/cmake/ROCMConfig.cmake:12 (message): Use of find_package(ROCM) is deprecated as of ROCm 6.4. 
Please use find_package(ROCmCMakeBuildTools) Call Stack (most recent call first): cmake/Dependencies.cmake:75 (find_package) CMakeLists.txt:102 (include) -- ROCM_PATH found: /usr -- Compiling with hipcc -- Found Threads: TRUE -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success -- HIP compiler: clang -- HIP runtime: rocclr -- hipcc executable: /usr/bin/hipcc sh: line 1: /usr/bin/rocm_agent_enumerator: No such file or directory -- hipcc version: 6.4.43484 -- hipconfig executable: /usr/bin/hipconfig -- hipcc HIP version: 6.4.43484 -- ROCm version: 6.4.2 -- Looking for hipDeviceMallocUncached -- Looking for hipDeviceMallocUncached - found -- Looking for hipDeviceMallocContiguous -- Looking for hipDeviceMallocContiguous - found -- RCCL LL128 protocol enabled -- HSA runtime: /usr/include -- Found rocm_smi at /usr/include -- Looking for C++ include /usr/include/rocm_smi/rocm_smi64Config.h -- Looking for C++ include /usr/include/rocm_smi/rocm_smi64Config.h - found -- RSMI_INIT_FLAG_THRAD_ONLY_MUTEX supported -- Performing Test HAVE_KERNARG_PRELOAD -- Performing Test HAVE_KERNARG_PRELOAD - Success -- Kernarg preloading to SGPR enabled -- Performing Test HAVE_PARALLEL_JOBS -- Performing Test HAVE_PARALLEL_JOBS - Success -- Parallel jobs enabled CMake Warning at CMakeLists.txt:331 (message): ROCTX library not found. Skipping ROCTX linking. 
-- Found Python3: /usr/bin/python3.14 (found version "3.14.0") found components: Interpreter -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.h -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp -- Generating 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp -- Generating /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp -- HIP_CONTIGUOUS_MEMORY enabled -- HIP_UNCACHED_MEMORY enabled -- Use 1 jobs for linking -- Building shared RCCL library CMake Deprecation Warning at test/CMakeLists.txt:4 (cmake_minimum_required): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. Building rccl unit tests (Installed in /test/rccl-UnitTests) hsa-runtime64 found @ /usr/lib64/cmake/hsa-runtime64 -- rocm-cmake: Set license file to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/LICENSE.txt. -- Configuring done (46.0s) -- Generating done (0.1s) CMake Warning: Manually-specified variables were not used by the project: AMDGPU_TARGETS CMAKE_CXX_FLAGS_RELEASE CMAKE_C_FLAGS_RELEASE CMAKE_Fortran_FLAGS_RELEASE CMAKE_INSTALL_DO_STRIP LIB_SUFFIX SHARE_INSTALL_PREFIX SYSCONF_INSTALL_DIR -- Build files have been written to: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build + /usr/bin/cmake --build redhat-linux-build -j2 --verbose Change Dir: '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j2 /usr/bin/cmake -S/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2 -B/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/CMakeFiles /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build//CMakeFiles/progress.marks /usr/bin/gmake -f 
CMakeFiles/Makefile2 all gmake[1]: Entering directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' /usr/bin/gmake -f CMakeFiles/git_version_check.dir/build.make CMakeFiles/git_version_check.dir/depend gmake[2]: Entering directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2 /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2 /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/CMakeFiles/git_version_check.dir/DependInfo.cmake "--color=" gmake[2]: Leaving directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' /usr/bin/gmake -f CMakeFiles/git_version_check.dir/build.make CMakeFiles/git_version_check.dir/build gmake[2]: Entering directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' [ 0%] Updating git_version.cpp if necessary /usr/bin/cmake -P /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/git_version.cmake -- Updating git_version.cpp gmake[2]: Leaving directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' [ 0%] Built target git_version_check /usr/bin/gmake -f CMakeFiles/rccl.dir/build.make CMakeFiles/rccl.dir/depend gmake[2]: Entering directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' [ 0%] Hipifying src/bootstrap.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc [ 0%] Hipifying src/transport/shm.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/bootstrap.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport/shm.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc [ 0%] Hipifying src/channel.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/channel.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc [ 0%] Hipifying src/collectives.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/collectives.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc [ 1%] Hipifying src/debug.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/debug.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc [ 1%] Hipifying src/device/all_gather.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/all_gather.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h [ 1%] Hipifying src/device/all_reduce.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl 
-quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/all_reduce.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h [ 2%] Hipifying src/device/alltoall_pivot.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/alltoall_pivot.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h [ 2%] Hipifying src/device/broadcast.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/broadcast.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h [ 3%] Hipifying src/device/common.cu -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/common.cu -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h [ 3%] Hipifying src/device/common.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h [ 3%] Hipifying src/device/common_kernel.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common_kernel.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/common_kernel.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common_kernel.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common_kernel.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/common.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common_kernel.h [ 3%] Hipifying src/device/msccl_kernel_impl.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/msccl_kernel_impl.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h [ 3%] Hipifying src/device/network/unpack/unpack.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack/unpack.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack && /usr/bin/hipify-perl -quiet-warnings 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/network/unpack/unpack.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack/unpack.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack/unpack.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h [ 3%] Hipifying src/device/network/unpack/unpack_defs.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack/unpack_defs.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/network/unpack/unpack_defs.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack/unpack_defs.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack/unpack_defs.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack/unpack.h [ 3%] Hipifying src/device/onerank.cu -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/onerank.cu -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack/unpack_defs.h [ 4%] Hipifying src/device/op128.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/op128.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/op128.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/op128.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/op128.h [ 4%] Hipifying src/device/primitives.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/primitives.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/op128.h [ 4%] 
Hipifying src/device/prims_ll.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/prims_ll.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h [ 4%] Hipifying src/device/prims_ll128.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/prims_ll128.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h [ 5%] Hipifying src/device/prims_simple.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/prims_simple.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h [ 5%] Hipifying src/device/reduce.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/reduce.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h [ 5%] Hipifying src/device/reduce_kernel.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_kernel.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/reduce_kernel.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_kernel.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_kernel.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h [ 5%] Hipifying src/device/reduce_scatter.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/reduce_scatter.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_kernel.h [ 5%] Hipifying src/device/sendrecv.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/device/sendrecv.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h [ 5%] Hipifying src/enqueue.cc -> 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/enqueue.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc Added COLL_UNROLL template argument to /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h [ 6%] Hipifying src/graph/connect.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/connect.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc [ 6%] Hipifying src/graph/paths.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/paths.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc [ 6%] Hipifying src/graph/rings.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/rings.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc [ 6%] Hipifying src/graph/rings.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/rings.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.h [ 7%] Hipifying src/graph/rome_models.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/rome_models.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc [ 7%] Hipifying src/graph/rome_models.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/rome_models.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.h [ 7%] Hipifying src/graph/search.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/search.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc [ 7%] Hipifying src/graph/topo.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/topo.cc -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc [ 7%] Hipifying src/graph/topo.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/topo.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h [ 8%] Hipifying src/graph/trees.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/trees.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/trees.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/trees.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/trees.cc [ 8%] Hipifying src/graph/tuning.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/tuning.cc -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc [ 8%] Hipifying src/graph/xml.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/xml.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc [ 8%] Hipifying src/graph/xml.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/graph/xml.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h [ 8%] Hipifying src/group.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/group.cc -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc [ 8%] Hipifying src/include/BfdBacktrace.hpp -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/BfdBacktrace.hpp mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/BfdBacktrace.hpp -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/BfdBacktrace.hpp && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/BfdBacktrace.hpp [ 9%] Hipifying src/include/alloc.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/alloc.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h [ 9%] Hipifying src/include/alt_rsmi.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alt_rsmi.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/alt_rsmi.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alt_rsmi.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alt_rsmi.h [ 9%] Hipifying src/include/api_trace.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/api_trace.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/api_trace.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/api_trace.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/api_trace.h [ 9%] Hipifying src/include/archinfo.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/archinfo.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/archinfo.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/archinfo.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/archinfo.h [ 10%] Hipifying src/include/argcheck.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/argcheck.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h [ 10%] Hipifying src/include/bitops.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bitops.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/bitops.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bitops.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bitops.h [ 10%] Hipifying src/include/bootstrap.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/bootstrap.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h [ 10%] Hipifying src/include/channel.h -> 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/channel.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h [ 11%] Hipifying src/include/checks.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/checks.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/checks.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/checks.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/checks.h [ 11%] Hipifying src/include/collectives.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/collectives.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h [ 11%] Hipifying src/include/coll_net.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/coll_net.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h [ 11%] Hipifying src/include/comm.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/comm.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h [ 12%] Hipifying src/include/core.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/core.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h [ 12%] Hipifying src/include/cpuset.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/cpuset.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/cpuset.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/cpuset.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/cpuset.h [ 12%] Hipifying src/include/debug.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/debug.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/debug.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/debug.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/debug.h [ 12%] Hipifying src/include/device.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/device.h -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h [ 13%] Hipifying src/include/enqueue.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/enqueue.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h [ 13%] Hipifying src/include/gdrwrap.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/gdrwrap.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h [ 13%] Hipifying src/include/git_version.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/git_version.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/git_version.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/git_version.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/git_version.h [ 13%] Hipifying src/include/graph.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/graph.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h [ 14%] Hipifying src/include/group.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/group.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h [ 14%] Hipifying src/include/hip_rocm_version_info.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/hip_rocm_version_info.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/hip_rocm_version_info.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/hip_rocm_version_info.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/hip_rocm_version_info.h [ 14%] Hipifying src/include/ibvcore.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvcore.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/ibvcore.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvcore.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvcore.h [ 14%] Hipifying src/include/ibvsymbols.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvsymbols.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/ibvsymbols.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvsymbols.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvsymbols.h [ 14%] Hipifying src/include/ibvwrap.h 
-> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/ibvwrap.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h [ 15%] Hipifying src/include/info.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/info.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h [ 15%] Hipifying src/include/ipcsocket.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ipcsocket.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/ipcsocket.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ipcsocket.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ipcsocket.h [ 15%] Hipifying src/include/msccl/msccl_kernel.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_kernel.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/msccl/msccl_kernel.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_kernel.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_kernel.h [ 16%] Hipifying src/include/msccl/msccl_lifecycle.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_lifecycle.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/msccl/msccl_lifecycle.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_lifecycle.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_lifecycle.h [ 16%] Hipifying src/include/msccl/msccl_parser.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl && /usr/bin/hipify-perl -quiet-warnings 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/msccl/msccl_parser.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h [ 16%] Hipifying src/include/msccl/msccl_scheduler.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_scheduler.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/msccl/msccl_scheduler.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_scheduler.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_scheduler.h [ 16%] Hipifying src/include/msccl/msccl_setup.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_setup.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/msccl/msccl_setup.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_setup.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_setup.h [ 17%] Hipifying src/include/msccl/msccl_status.h -> 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/msccl/msccl_status.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h [ 17%] Hipifying src/include/msccl/msccl_struct.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/msccl/msccl_struct.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h [ 17%] Hipifying src/include/nccl_common.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_common.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nccl_common.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_common.h && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_common.h [ 17%] Hipifying src/include/nccl_net.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_net.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nccl_net.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_net.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_net.h [ 18%] Hipifying src/include/nccl_tuner.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_tuner.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nccl_tuner.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_tuner.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nccl_tuner.h [ 18%] Hipifying src/include/net.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/net.h -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h [ 18%] Hipifying src/include/net_device.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net_device.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/net_device.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net_device.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net_device.h [ 18%] Hipifying src/include/npkit/npkit.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/npkit/npkit.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h [ 18%] Hipifying src/include/npkit/npkit_event.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit_event.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit && 
/usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/npkit/npkit_event.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit_event.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit_event.h [ 18%] Hipifying src/include/npkit/npkit_struct.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit_struct.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/npkit/npkit_struct.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit_struct.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit_struct.h [ 18%] Hipifying src/include/nvmlwrap.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvmlwrap.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvmlwrap.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvmlwrap.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvmlwrap.h [ 19%] Hipifying src/include/nvtx.h -> 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h [ 20%] Hipifying src/include/nvtx3/nvToolsExt.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExt.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExt.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExt.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExt.h [ 20%] Hipifying src/include/nvtx3/nvToolsExtCounters.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCounters.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtCounters.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCounters.h && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCounters.h [ 20%] Hipifying src/include/nvtx3/nvToolsExtCuda.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCuda.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtCuda.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCuda.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCuda.h [ 20%] Hipifying src/include/nvtx3/nvToolsExtCudaRt.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCudaRt.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtCudaRt.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCudaRt.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtCudaRt.h [ 21%] Hipifying src/include/nvtx3/nvToolsExtMem.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtMem.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtMem.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtMem.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtMem.h [ 21%] Hipifying src/include/nvtx3/nvToolsExtMemCudaRt.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtMemCudaRt.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtMemCudaRt.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtMemCudaRt.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtMemCudaRt.h [ 21%] Hipifying src/include/nvtx3/nvToolsExtOpenCL.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtOpenCL.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtOpenCL.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtOpenCL.h && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtOpenCL.h [ 21%] Hipifying src/include/nvtx3/nvToolsExtPayload.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtPayload.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtPayload.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtPayload.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtPayload.h [ 22%] Hipifying src/include/nvtx3/nvToolsExtPayloadHelper.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtPayloadHelper.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtPayloadHelper.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtPayloadHelper.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtPayloadHelper.h [ 22%] Hipifying src/include/nvtx3/nvToolsExtSemanticsCounters.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSemanticsCounters.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtSemanticsCounters.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSemanticsCounters.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSemanticsCounters.h [ 22%] Hipifying src/include/nvtx3/nvToolsExtSemanticsScope.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSemanticsScope.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtSemanticsScope.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSemanticsScope.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSemanticsScope.h [ 22%] Hipifying src/include/nvtx3/nvToolsExtSync.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSync.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvToolsExtSync.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSync.h && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvToolsExtSync.h [ 23%] Hipifying src/include/nvtx3/nvtx3.hpp -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtx3.hpp mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3 && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtx3.hpp -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtx3.hpp && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtx3.hpp [ 23%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtHelperMacros.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtHelperMacros.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtHelperMacros.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtHelperMacros.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtHelperMacros.h [ 23%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtImpl.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImpl.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtImpl.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImpl.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImpl.h [ 23%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtImplCounters_v1.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplCounters_v1.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtImplCounters_v1.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplCounters_v1.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplCounters_v1.h [ 24%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtImplMemCudaRt_v1.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplMemCudaRt_v1.h [ 24%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtImplMem_v1.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplMem_v1.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtImplMemCudaRt_v1.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplMemCudaRt_v1.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplMemCudaRt_v1.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtImplMem_v1.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplMem_v1.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplMem_v1.h [ 24%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtImplPayload_v1.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplPayload_v1.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtImplPayload_v1.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplPayload_v1.h && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtImplPayload_v1.h [ 24%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtInit.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtInit.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtInit.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtInit.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtInit.h [ 24%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtPayloadHelperInternal.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtPayloadHelperInternal.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtPayloadHelperInternal.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtPayloadHelperInternal.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtPayloadHelperInternal.h [ 25%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtPayloadTypeInfo.h -> 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtPayloadTypeInfo.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtPayloadTypeInfo.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtPayloadTypeInfo.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtPayloadTypeInfo.h [ 25%] Hipifying src/include/nvtx3/nvtxDetail/nvtxExtTypes.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtTypes.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxExtTypes.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtTypes.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxExtTypes.h [ 25%] Hipifying src/include/nvtx3/nvtxDetail/nvtxImpl.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImpl.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxImpl.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImpl.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImpl.h [ 25%] Hipifying src/include/nvtx3/nvtxDetail/nvtxImplCore.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCore.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxImplCore.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCore.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCore.h [ 25%] Hipifying src/include/nvtx3/nvtxDetail/nvtxImplCudaRt_v3.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCudaRt_v3.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxImplCudaRt_v3.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCudaRt_v3.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCudaRt_v3.h [ 26%] Hipifying src/include/nvtx3/nvtxDetail/nvtxImplCuda_v3.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCuda_v3.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxImplCuda_v3.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCuda_v3.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplCuda_v3.h [ 26%] Hipifying src/include/nvtx3/nvtxDetail/nvtxImplOpenCL_v3.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplOpenCL_v3.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxImplOpenCL_v3.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplOpenCL_v3.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplOpenCL_v3.h [ 26%] Hipifying src/include/nvtx3/nvtxDetail/nvtxImplSync_v3.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplSync_v3.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxImplSync_v3.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplSync_v3.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxImplSync_v3.h [ 26%] Hipifying src/include/nvtx3/nvtxDetail/nvtxInit.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInit.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxInit.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInit.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInit.h [ 27%] Hipifying src/include/nvtx3/nvtxDetail/nvtxInitDecls.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInitDecls.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxInitDecls.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInitDecls.h && 
/usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInitDecls.h [ 27%] Hipifying src/include/nvtx3/nvtxDetail/nvtxInitDefs.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInitDefs.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxInitDefs.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInitDefs.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxInitDefs.h [ 27%] Hipifying src/include/nvtx3/nvtxDetail/nvtxLinkOnce.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxLinkOnce.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxLinkOnce.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxLinkOnce.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxLinkOnce.h [ 27%] Hipifying src/include/nvtx3/nvtxDetail/nvtxTypes.h -> 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxTypes.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx3/nvtxDetail/nvtxTypes.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxTypes.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx3/nvtxDetail/nvtxTypes.h [ 27%] Hipifying src/include/nvtx_stub.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx_stub.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/nvtx_stub.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx_stub.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx_stub.h [ 27%] Hipifying src/include/p2p.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/p2p.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h [ 27%] Hipifying src/include/param.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/param.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/param.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/param.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/param.h [ 27%] Hipifying src/include/profiler.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/profiler.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h [ 28%] Hipifying src/include/proxy.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/proxy.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h [ 28%] Hipifying src/include/rccl_common.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/rccl_common.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h [ 29%] Hipifying src/include/rccl_float8.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/rccl_float8.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h [ 29%] Hipifying src/include/rccl_vars.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_vars.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/rccl_vars.h -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_vars.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_vars.h [ 29%] Hipifying src/include/register.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/register.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/register.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/register.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/register.h [ 29%] Hipifying src/include/rocm_smi_wrap.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rocm_smi_wrap.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/rocm_smi_wrap.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rocm_smi_wrap.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rocm_smi_wrap.h [ 29%] Hipifying src/include/rocmwrap.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rocmwrap.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && 
/usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/rocmwrap.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rocmwrap.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rocmwrap.h [ 29%] Hipifying src/include/roctx.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/roctx.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h [ 30%] Hipifying src/include/shm.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/shm.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/shm.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/shm.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/shm.h [ 30%] Hipifying src/include/signals.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/signals.h mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/signals.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/signals.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/signals.h [ 30%] Hipifying src/include/socket.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/socket.h [ 30%] Hipifying src/include/strongstream.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/strongstream.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/strongstream.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/strongstream.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/strongstream.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/socket.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/socket.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/socket.h [ 30%] Hipifying src/include/timer.h -> 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/timer.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/timer.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/timer.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/timer.h [ 31%] Hipifying src/include/transport.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/transport.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/transport.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/transport.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/transport.h [ 31%] Hipifying src/include/trees.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/trees.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/trees.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/trees.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/trees.h [ 31%] Hipifying src/include/tuner.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/tuner.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h [ 31%] Hipifying src/include/utils.h -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/include/utils.h -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h [ 31%] Hipifying src/init.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/init.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc [ 32%] Hipifying src/init_nvtx.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/init_nvtx.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc [ 33%] Hipifying src/misc/alt_rsmi.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/alt_rsmi.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc [ 33%] Hipifying src/misc/api_trace.c -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.c mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/api_trace.c -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.c && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.c [ 34%] Hipifying src/misc/api_trace.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/api_trace.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc [ 34%] Hipifying src/misc/archinfo.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/archinfo.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/archinfo.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/archinfo.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/archinfo.cc [ 34%] Hipifying src/misc/argcheck.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/argcheck.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc [ 34%] Hipifying src/misc/ibvsymbols.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/ibvsymbols.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc [ 34%] Hipifying src/misc/ibvwrap.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/ibvwrap.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc [ 34%] Hipifying src/misc/ipcsocket.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/ipcsocket.cc -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc [ 34%] Hipifying src/misc/msccl/msccl_lifecycle.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/msccl/msccl_lifecycle.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc [ 35%] Hipifying src/misc/msccl/msccl_parser.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/msccl/msccl_parser.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc [ 35%] Hipifying src/misc/msccl/msccl_setup.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/msccl/msccl_setup.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc [ 35%] Hipifying src/misc/msccl/msccl_status.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/msccl/msccl_status.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc [ 35%] Hipifying src/misc/npkit.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/npkit.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc [ 36%] Hipifying src/misc/nvmlwrap_stub.cc 
-> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/nvmlwrap_stub.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/nvmlwrap_stub.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/nvmlwrap_stub.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/nvmlwrap_stub.cc [ 36%] Hipifying src/misc/param.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/param.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/param.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/param.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/param.cc [ 36%] Hipifying src/misc/profiler.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/profiler.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc [ 36%] Hipifying src/misc/rocm_smi_wrap.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/rocm_smi_wrap.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc [ 37%] Hipifying src/misc/rocmwrap.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocmwrap.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/rocmwrap.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocmwrap.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocmwrap.cc [ 37%] Hipifying src/misc/roctx.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/roctx.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc [ 37%] Hipifying src/misc/shmutils.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/shmutils.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc [ 37%] Hipifying src/misc/signals.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/signals.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/signals.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/signals.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/signals.cc [ 38%] Hipifying src/misc/socket.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/socket.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc && 
/usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc [ 38%] Hipifying src/misc/strongstream.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/strongstream.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/strongstream.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/strongstream.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/strongstream.cc [ 38%] Hipifying src/misc/tuner.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/tuner.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc [ 38%] Hipifying src/misc/utils.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/misc/utils.cc -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc [ 38%] Hipifying src/msccl.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/msccl.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc [ 38%] Hipifying src/net.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/net.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc [ 38%] Hipifying src/proxy.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/proxy.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc && 
/usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc [ 39%] Hipifying src/rccl_wrap.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/rccl_wrap.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc [ 39%] Hipifying src/register.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/register.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc [ 39%] Hipifying src/transport.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc && /usr/bin/cmake -E env bash 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc [ 39%] Hipifying src/transport/coll_net.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport/coll_net.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc [ 40%] Hipifying src/transport/generic.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport/generic.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc [ 40%] Hipifying src/transport/net_ib.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport/net_ib.cc -o 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc [ 40%] Hipifying src/transport/net_socket.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport/net_socket.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc [ 40%] Hipifying src/transport/net.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport/net.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc [ 41%] Hipifying src/transport/nvls.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport && 
/usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport/nvls.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc [ 41%] Hipifying src/transport/p2p.cc -> /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport && /usr/bin/hipify-perl -quiet-warnings /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/src/transport/p2p.cc -o /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc && /usr/bin/cmake -E env bash /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/add_unroll.sh /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2 /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2 /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/CMakeFiles/rccl.dir/DependInfo.cmake "--color=" gmake[2]: Leaving directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' /usr/bin/gmake -f CMakeFiles/rccl.dir/build.make CMakeFiles/rccl.dir/build gmake[2]: Entering directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' [ 41%] Building CXX object CMakeFiles/rccl.dir/hipify/src/channel.cc.o [ 42%] Building 
CXX object CMakeFiles/rccl.dir/hipify/src/bootstrap.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/channel.cc.o -MF CMakeFiles/rccl.dir/hipify/src/channel.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/channel.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc /usr/bin/hipcc 
-DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/bootstrap.cc.o -MF CMakeFiles/rccl.dir/hipify/src/bootstrap.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/bootstrap.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void 
*gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 8 warnings generated when compiling for gfx1101. 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/bootstrap.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/bootstrap.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. [ 42%] Building CXX object CMakeFiles/rccl.dir/hipify/src/collectives.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 
-mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/collectives.cc.o -MF CMakeFiles/rccl.dir/hipify/src/collectives.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/collectives.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 
'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * 
ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct 
ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ 8 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 
'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * 
ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct 
ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/channel.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { 
| ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' 
[-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ 8 warnings generated when compiling for host. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ [ 42%] Building CXX object CMakeFiles/rccl.dir/hipify/src/debug.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device 
-I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/debug.cc.o -MF CMakeFiles/rccl.dir/hipify/src/debug.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/debug.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc 31 warnings generated when compiling for gfx1101. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * 
ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 
'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused 
function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1102. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 
'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * 
ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct 
ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1200. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' 
[-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: 
unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t 
id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' 
[-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1201. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 
'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * 
ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct 
ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx906. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' 
[-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: 
unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t 
id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' 
[-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ 31 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload'
[-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { |
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 
| constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx906. 31 warnings generated when compiling for gfx90a. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx908. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * 
ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 
'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused 
function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx942. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:93:38: warning: unused variable 'AllGatherSchema' [-Wunused-variable] 93 | constexpr nvtxPayloadSchemaEntry_t AllGatherSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:98:23: warning: unused variable 'payload' [-Wunused-variable] 98 | NvtxParamsAllGather payload{sendcount * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:126:45: warning: unused variable 'AllReduceSchema' [-Wunused-variable] 126 | static constexpr nvtxPayloadSchemaEntry_t AllReduceSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:132:23: warning: unused variable 'payload' [-Wunused-variable] 132 | NvtxParamsAllReduce payload{count * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:161:38: warning: unused variable 
'AllToAllSchema' [-Wunused-variable] 161 | constexpr nvtxPayloadSchemaEntry_t AllToAllSchema[] = { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:166:22: warning: unused variable 'payload' [-Wunused-variable] 166 | NvtxParamsAllToAll payload{count * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:212:38: warning: unused variable 'AllToAllvSchema' [-Wunused-variable] 212 | constexpr nvtxPayloadSchemaEntry_t AllToAllvSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:219:23: warning: unused variable 'payload' [-Wunused-variable] 219 | NvtxParamsAllToAllv payload{sendcounts[comm->rank] * ncclTypeSize(datatype), recvcounts[comm->rank] * ncclTypeSize(datatype), datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:261:38: warning: unused variable 'BroadcastSchema' [-Wunused-variable] 261 | constexpr nvtxPayloadSchemaEntry_t BroadcastSchema[] = { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:267:23: warning: unused variable 'payload' [-Wunused-variable] 267 | NvtxParamsBroadcast payload{count * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:301:40: warning: unused variable 'GatherSchema' [-Wunused-variable] 301 | constexpr nvtxPayloadSchemaEntry_t GatherSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:307:22: warning: unused variable 'payload' [-Wunused-variable] 307 | NvtxParamsGather payload{sendcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:343:38: warning: unused variable 'ReduceSchema' [-Wunused-variable] 343 | constexpr nvtxPayloadSchemaEntry_t ReduceSchema[] = { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:351:20: warning: unused variable 'payload' [-Wunused-variable] 351 | NvtxParamsReduce payload{count * ncclTypeSize(datatype), root, op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:378:38: warning: unused variable 'ReduceScatterSchema' [-Wunused-variable] 378 | constexpr nvtxPayloadSchemaEntry_t ReduceScatterSchema[] = { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:385:27: warning: unused variable 'payload' [-Wunused-variable] 385 | NvtxParamsReduceScatter payload{recvcount * ncclTypeSize(datatype), op, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:412:40: warning: unused variable 'ScatterSchema' [-Wunused-variable] 412 | constexpr nvtxPayloadSchemaEntry_t ScatterSchema[] = { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:418:23: warning: unused variable 'payload' [-Wunused-variable] 418 | NvtxParamsScatter payload{recvcount * ncclTypeSize(datatype), root, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:461:22: warning: unused variable 'payload' [-Wunused-variable] 461 | NvtxParamsSendRecv payload{count * ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:486:22: warning: unused variable 'payload' [-Wunused-variable] 486 | NvtxParamsSendRecv payload{count * 
ncclTypeSize(datatype), peer, datatype}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct 
ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/collectives.cc:448:42: warning: unused variable 'SendRecvSchema' [-Wunused-const-variable] 448 | constexpr const nvtxPayloadSchemaEntry_t SendRecvSchema[] = { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for host. 
[ 43%] Building CXX object CMakeFiles/rccl.dir/hipify/src/enqueue.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/enqueue.cc.o -MF CMakeFiles/rccl.dir/hipify/src/enqueue.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/enqueue.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:188:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 188 | hipGetDevice(&cudaDev); | ^~~~~~~~~~~~ ~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/debug.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ 2 warnings generated when compiling for host. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ [ 43%] Building CXX object CMakeFiles/rccl.dir/hipify/src/group.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/group.cc.o -MF 
CMakeFiles/rccl.dir/hipify/src/group.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/group.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { 
NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static 
ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 35 
warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | 
static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 35 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t 
dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 35 
warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | 
static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 35 
warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t 
dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 35 
warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | 
static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 35 
warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t 
dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 35 
warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | 
static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 35 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/group.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/group.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 43%] Building CXX object CMakeFiles/rccl.dir/hipify/src/init.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/init.cc.o -MF CMakeFiles/rccl.dir/hipify/src/init.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/init.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused 
function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 35 warnings generated when compiling for gfx90a. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, 
handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static 
ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { 
NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int 
pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 
'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | 
^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' 
[-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const 
char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, 
supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int 
size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 35 
warnings generated when compiling for gfx942. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static 
ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' 
[-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' 
[-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t 
xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:72:5: warning: unused label 'ignore0' [-Wunused-label] 72 | ignore0:; | ^~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, 
ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 
28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool 
isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:16: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:224:21: warning: unused function 'ncclGdrCudaFree' [-Wunused-function] 224 | static ncclResult_t ncclGdrCudaFree(void* gdrHandle) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:103:22: warning: unused function 'ncclFuncSendCount' [-Wunused-function] 103 | static inline size_t ncclFuncSendCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:106:22: warning: unused function 'ncclFuncRecvCount' [-Wunused-function] 106 | static inline size_t ncclFuncRecvCount(ncclFunc_t func, int nRanks, size_t count) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:274:21: warning: unused function 'cleanupIpc' [-Wunused-function] 274 | static ncclResult_t cleanupIpc(struct ncclComm* comm, struct ncclCommCallback* cb) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/enqueue.cc:1069:12: warning: unused function 'calcP2pChannelCount' [-Wunused-function] 1069 | static int calcP2pChannelCount(size_t totalSize, int minChannels, int maxChannels, size_t minSize, size_t maxSize) { | ^~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, 
handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static 
ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { 
NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int 
pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 
'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | 
^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' 
[-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx1101. 35 warnings generated when compiling for host. 
[ 43%] Building CXX object CMakeFiles/rccl.dir/hipify/src/init_nvtx.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/init_nvtx.cc.o -MF CMakeFiles/rccl.dir/hipify/src/init_nvtx.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/init_nvtx.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, 
cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 
'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 
22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t 
ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ 2 warnings generated when compiling for gfx1101. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' 
[-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static 
ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t 
ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: 
warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, 
size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | 
static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx1201. 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' 
[-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | 
static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 
'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | 
^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct 
ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: 
warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | 
hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, 
mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { 
NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float 
ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | 
static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 2 warnings generated when 
compiling for gfx908. 58 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, 
cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 
'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 
22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t 
ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/nvtx.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init_nvtx.cc:4:42: warning: unused variable 'NvtxEnumRedSchema' [-Wunused-const-variable] 4 | static constexpr const nvtxPayloadEnum_t NvtxEnumRedSchema[] = { | ^~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for host. 
[ 44%] Building CXX object CMakeFiles/rccl.dir/hipify/src/net.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/net.cc.o -MF CMakeFiles/rccl.dir/hipify/src/net.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/net.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' 
[-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static 
ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t 
ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ 58 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: 
warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1857:11: warning: unused variable 'stackSize' [-Wunused-variable] 1857 | int64_t stackSize; | ^~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:1858:19: warning: unused variable 'devProp' [-Wunused-variable] 1858 | hipDeviceProp_t devProp; | ^~~~~~~ 2 warnings generated when compiling for gfx1100. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2264:26: warning: unused variable 'payload' [-Wunused-variable] 2264 | NvtxParamsCommInitRank payload{myrank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2278:38: warning: unused variable 'CommInitAllSchema' [-Wunused-variable] 2278 | constexpr nvtxPayloadSchemaEntry_t CommInitAllSchema[] = { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2563:26: warning: unused variable 'payload' [-Wunused-variable] 2563 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2598:26: warning: unused variable 'payload' [-Wunused-variable] 2598 | NvtxParamsCommInitRank payload{rank, nranks, cudaDev}; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2875:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 2875 | hipSetDevice(saveDevice); | ^~~~~~~~~~~~ ~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, 
void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return 
ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:39: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:40: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static 
ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | 
static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:86:21: warning: unused function 'commReclaim' [-Wunused-function] 86 | static ncclResult_t commReclaim(ncclComm_t comm); | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/init.cc:2249:36: warning: unused variable 'CommInitRankSchema' [-Wunused-const-variable] 2249 | constexpr nvtxPayloadSchemaEntry_t CommInitRankSchema[] = { | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 58 warnings generated when compiling for host. 
[ 44%] Building CXX object CMakeFiles/rccl.dir/hipify/src/msccl.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/msccl.cc.o -MF CMakeFiles/rccl.dir/hipify/src/msccl.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/msccl.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct 
mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1201. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/net.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 44%] Building CXX object CMakeFiles/rccl.dir/hipify/src/proxy.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/proxy.cc.o -MF CMakeFiles/rccl.dir/hipify/src/proxy.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/proxy.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 4 warnings generated when compiling for gfx1030. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 4 warnings generated when compiling for gfx1100. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 4 warnings generated when compiling for gfx1101. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:54:38: warning: unused variable 'MscclSchema' [-Wunused-variable] 54 | constexpr nvtxPayloadSchemaEntry_t MscclSchema[] = { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:61:19: warning: unused variable 'payload' [-Wunused-variable] 61 | NvtxParamsMsccl payload{count * ncclTypeSize(dataType), op, dataType}; | ^~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/enqueue.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/msccl.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ 7 warnings generated when compiling for host. 
[ 44%] Building CXX object CMakeFiles/rccl.dir/hipify/src/rccl_wrap.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/rccl_wrap.cc.o -MF CMakeFiles/rccl.dir/hipify/src/rccl_wrap.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/rccl_wrap.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: 
warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 4 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 
'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 9 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 4 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct 
ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 9 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 4 warnings generated when compiling for gfx1201. 9 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, 
int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of 
function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 9 warnings generated when compiling for gfx1102. 4 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' 
[-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of 
function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ 9 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 9 warnings generated when compiling for gfx1201. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 9 warnings generated when compiling for gfx906. 4 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:289:7: warning: variable 'sublist_len' set but not used [-Wunused-but-set-variable] 289 | int sublist_len = 0; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, 
int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:1570:3: warning: ignoring return value of 
function declared with 'nodiscard' attribute [-Wunused-result] 1570 | hipDeviceSynchronize(); | ^~~~~~~~~~~~~~~~~~~~ 9 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/proxy.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for host. 
[ 45%] Building CXX object CMakeFiles/rccl.dir/hipify/src/register.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/register.cc.o -MF CMakeFiles/rccl.dir/hipify/src/register.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/register.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | 
static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 9 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 9 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:25: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/rccl_wrap.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, 
int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 9 warnings generated when compiling for host. 
[ 45%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/register.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 45%] Building CXX object CMakeFiles/rccl.dir/hipify/src/device/common.cu.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/device/common.cu.cpp.o -MF CMakeFiles/rccl.dir/hipify/src/device/common.cu.cpp.o.d -o CMakeFiles/rccl.dir/hipify/src/device/common.cu.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 
warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 45%] Building CXX object CMakeFiles/rccl.dir/hipify/src/device/onerank.cu.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/device/onerank.cu.cpp.o -MF CMakeFiles/rccl.dir/hipify/src/device/onerank.cu.cpp.o.d -o CMakeFiles/rccl.dir/hipify/src/device/onerank.cu.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp 1 warning generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1100. 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1102. 1 warning generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1200. 1 warning generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx908. 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx90a. 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx942. 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.cu.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for host. [ 45%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/connect.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm 
--amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/connect.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/connect.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/graph/connect.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t 
id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 
'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx1100. 
2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* 
nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' 
[-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' 
[-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx1102. 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 
'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/onerank.cu.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 46%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/paths.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/paths.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/paths.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/graph/paths.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* 
nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' 
[-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct 
ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct 
ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 
'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, 
int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 
'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, 
int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 
'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, 
int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 
'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ 13 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, 
int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:124:12: warning: unused variable 'y' [-Wunused-variable] 124 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:131:7: warning: unused variable 'localRanks' [-Wunused-variable] 131 | int localRanks = comm->topo->nodes[GPU].count; | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 
'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:265:21: warning: unused function 'getIndexes' [-Wunused-function] 265 | static ncclResult_t getIndexes(int* ranks, int* indexes, int nNodes) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/connect.cc:439:21: warning: unused function 'connectNvls' [-Wunused-function] 439 | static ncclResult_t connectNvls(struct ncclComm* comm, int* nvlsHeads, int nHeads) { | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx1201. 13 warnings generated when compiling for host. 
[ 46%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/rings.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/rings.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/rings.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/graph/rings.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' 
[-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct 
ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct 
ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 31 warnings generated when compiling for gfx906. 1 warning generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct 
ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t 
xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 1 warning generated when compiling for gfx1100. 31 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ 1 warning generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | 
static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | 
static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: 
warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ 1 warning generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' 
[-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:275:7: warning: variable 'intermediateIndex' set but not used [-Wunused-but-set-variable] 275 | int intermediateIndex = -1; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:462:24: warning: unused variable 'gpu' [-Wunused-variable] 462 | struct ncclTopoNode* gpu = system->nodes[GPU].nodes+g; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, 
int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/paths.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 31 warnings generated when compiling for host. 
[ 46%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/rome_models.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/rome_models.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/rome_models.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/graph/rome_models.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but 
not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t 
xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | 
static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 38 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - 
tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | 
static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct 
ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, 
struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 38 warnings generated when compiling for gfx1100. 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: 
unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct 
ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' 
[-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | 
static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 1 warning generated when compiling for gfx90a. 38 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx942. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: 
warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' 
[-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 38 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rings.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for host. 
[ 46%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/search.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/search.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/search.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/graph/search.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 
| static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 
'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 38 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** 
node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 
'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused 
variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct 
ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' 
[-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' 
[-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 38 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 
'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static 
ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t 
xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | 
static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 38 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const 
char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: 
unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int 
x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t 
xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' 
[-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | 
static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 38 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const 
char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: 
unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int 
x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t 
xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' 
[-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | 
static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 38 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const 
char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: 
unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int 
x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t 
xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' 
[-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | 
static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 38 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const 
char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: 
unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1341:7: warning: unused variable 'nChannels' [-Wunused-variable] 1341 | int nChannels = 0; | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1351:12: warning: unused variable 'y' [-Wunused-variable] 1351 | int 
x=0, y=0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1858:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 1858 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:1930:9: warning: unused variable 't' [-Wunused-variable] 1930 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2036:7: warning: unused variable 'ncpus' [-Wunused-variable] 2036 | int ncpus = system->nodes[CPU].count; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2130:9: warning: unused variable 't' [-Wunused-variable] 2130 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2240:7: warning: variable 'gcnt' set but not used [-Wunused-but-set-variable] 2240 | int gcnt = 0; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:2316:9: warning: unused variable 't' [-Wunused-variable] 2316 | float t = (tve.tv_sec - tvs.tv_sec)*1E3 + (tve.tv_usec - tvs.tv_usec)/1E3; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:23: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:26: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/rome_models.cc:27: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t 
xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' 
[-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | 
static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const 
char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: 
unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx906. 38 warnings generated when compiling for host. [ 47%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/topo.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/topo.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/topo.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/graph/topo.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* 
attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 
'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* 
collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | 
^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* 
node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** 
node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 
'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | 
static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused 
function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' 
[-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** 
node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 
'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ 18 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | 
static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused 
function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' 
[-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/search.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** 
node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 
'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct 
ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' 
[-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' 
[-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 18 warnings generated when compiling for host. 
[ 47%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/trees.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/trees.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/trees.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/graph/trees.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/trees.cc 30 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { 
NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static 
ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 
'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* 
collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | 
^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* 
node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 
'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* 
collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | 
^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* 
node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 
'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* 
collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | 
^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* 
node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 
'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* 
collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | 
^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* 
node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 
'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* 
collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | 
^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* 
node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:11: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 
'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:16:20: warning: unused function 'collNetName' [-Wunused-function] 16 | static const char* collNetName(struct ncclComm* comm) { return comm->ncclCollNet->name; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:21:21: warning: unused function 'collNetReduceSupport' [-Wunused-function] 21 | static ncclResult_t collNetReduceSupport(struct ncclComm* comm, ncclDataType_t dataType, ncclRedOp_t redOp, int* supported) { NCCLCHECK(comm->ncclCollNet->reduceSupport(dataType, redOp, supported)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* 
collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | 
^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* 
node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 30 warnings generated when compiling for host. 
[ 47%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/tuning.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/tuning.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/tuning.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/graph/tuning.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | 
^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx1030. [ 47%] Building CXX object CMakeFiles/rccl.dir/hipify/src/graph/xml.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/graph/xml.cc.o -MF CMakeFiles/rccl.dir/hipify/src/graph/xml.cc.o.d -o 
CMakeFiles/rccl.dir/hipify/src/graph/xml.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 
'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:107:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 107 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:139:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 139 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:171:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 171 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:203:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 203 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:235:21: warning: suggest braces around initialization of subobject [-Wmissing-braces] 235 | .llProtoRanges = {RCCL_LL_LIMITS_UNDEFINED}, | ^~~~~~~~~~~~~~~~~~~~~~~~ | { } /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_common.h:35:34: note: expanded from macro 'RCCL_LL_LIMITS_UNDEFINED' 35 | #define RCCL_LL_LIMITS_UNDEFINED 0 | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:3: warning: nested designators are a C99 extension [-Wc99-designator] 267 | .llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:267:17: warning: array designators are a C99 extension [-Wc99-designator] 267 | 
.llProtoRanges[RCCL_RS_TUNABLE] = /*ReduceScatter*/ {/* LL (Min/Max) */ {0, 655360, 1} , /* LL128 (Min/Max) */ {131072, 3211264, 1}}, | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:354:10: warning: unused variable 'llMaxBw' [-Wunused-variable] 354 | double llMaxBw = llMaxBws[index1][index2]; | ^~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:355:10: warning: unused variable 'perChMaxTreeBw' [-Wunused-variable] 355 | double perChMaxTreeBw = perChMaxTreeBws[compCapIndex][index2]; | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:356:10: warning: unused variable 'perChMaxRingLL128Bw' [-Wunused-variable] 356 | double perChMaxRingLL128Bw = perChMaxRingLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:357:10: warning: unused variable 'perChMaxTreeLL128Bw' [-Wunused-variable] 357 | double perChMaxTreeLL128Bw = perChMaxTreeLL128Bws[compCapIndex][index2]; | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:360:9: warning: unused variable 'ppn' [-Wunused-variable] 360 | float ppn = (float)nRanks / nNodes; // if ppn < 2, then we are sending/receiving at the same GPU through the NIC, apply some bw discount | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/tuning.cc:648:14: warning: unused variable 'treeCorrectionFactor' [-Wunused-variable] 648 | static float 
treeCorrectionFactor[NCCL_NUM_PROTOCOLS][23] = { | ^~~~~~~~~~~~~~~~~~~~ 27 warnings generated when compiling for host. [ 48%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/alt_rsmi.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/alt_rsmi.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/alt_rsmi.cc.o.d -o 
CMakeFiles/rccl.dir/hipify/src/misc/alt_rsmi.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | 
^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for gfx942. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? 
[-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.cc:17: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' 
[-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 48%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/archinfo.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/archinfo.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/archinfo.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/archinfo.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/archinfo.cc /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? [-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx1100. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? 
[-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx1101. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? [-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx1102. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? [-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx1200. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? [-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx1201. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? [-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx906. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? [-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? [-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx90a. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? [-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ [ 48%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/argcheck.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG 
-std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/argcheck.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/argcheck.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/argcheck.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ 6 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:105:33: warning: bitwise negation of a boolean expression always evaluates to 'true'; did you mean logical negation? 
[-Wbool-operation] 105 | if (ret_gpu_id == 0 && ~(ret_unique_id != 0 || ret_loc_id != 0 || ret_unique_id != 0 || ret_vendor != 0) && | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ! /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:103:13: warning: unused variable 'ret_domain' [-Wunused-variable] 103 | int ret_domain = read_node_properties(node_id, "domain", &domain, properties); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:233:22: warning: unused variable 'hops' [-Wunused-variable] 233 | uint64_t hops; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:70:14: warning: unused variable 'count' [-Wunused-variable] 70 | uint32_t count = 0; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:52:20: warning: unused variable 'kPathDRMRoot' [-Wunused-variable] 52 | static const char *kPathDRMRoot = "/sys/class/drm"; | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/alt_rsmi.cc:559:13: warning: unused function 'fileExists' [-Wunused-function] 559 | static bool fileExists(char const *filename) | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. 6 warnings generated when compiling for host. 
[ 48%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/api_trace.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/api_trace.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/api_trace.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/api_trace.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/argcheck.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/argcheck.h:10: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. [ 48%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/ibvsymbols.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm 
--amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/ibvsymbols.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/ibvsymbols.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/ibvsymbols.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/api_trace.cc:3: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for host. 
[ 49%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/ibvwrap.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/ibvwrap.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/ibvwrap.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/ibvwrap.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvsymbols.cc:67: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for host. [ 49%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/ipcsocket.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC 
-parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/ipcsocket.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/ipcsocket.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/ipcsocket.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ibvwrap.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/ibvwrap.h:21: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for host. 
[ 49%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/npkit.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/npkit.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/npkit.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/npkit.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { 
| ^~~~~ 2 warnings generated when compiling for gfx1201. 1 warning generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/ipcsocket.cc:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for host. 
[ 49%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/nvmlwrap_stub.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/nvmlwrap_stub.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/nvmlwrap_stub.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/nvmlwrap_stub.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/nvmlwrap_stub.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/npkit/npkit.h:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/npkit.cc:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 50%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/param.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/param.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/param.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/param.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/param.cc [ 50%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/profiler.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/profiler.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/profiler.cc.o.d -o 
CMakeFiles/rccl.dir/hipify/src/misc/profiler.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
[ 50%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/rocm_smi_wrap.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/rocm_smi_wrap.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/rocm_smi_wrap.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/rocm_smi_wrap.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/profiler.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/profiler.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/proxy.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/info.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 50%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/rocmwrap.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/rocmwrap.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/rocmwrap.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/rocmwrap.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocmwrap.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/rocm_smi_wrap.cc:24: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for host. [ 51%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/roctx.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC 
-parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/roctx.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/roctx.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/roctx.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1101. [ 51%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/shmutils.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC 
-parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/shmutils.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/shmutils.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/shmutils.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/roctx.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/roctx.h:18: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ [ 51%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/signals.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include 
-I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/signals.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/signals.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/signals.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/signals.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/shmutils.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 51%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/socket.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/socket.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/socket.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/socket.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. [ 51%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/strongstream.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized 
-Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/strongstream.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/strongstream.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/strongstream.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/strongstream.cc /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:602:8: warning: unused variable 'line' [-Wunused-variable] 602 | char line[SOCKET_NAME_MAXLEN+1]; | ^~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/socket.cc:9: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 52%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/tuner.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/tuner.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/tuner.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/tuner.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
[ 52%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/utils.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/utils.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/utils.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/utils.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/tuner.cc:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/tuner.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 52%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_lifecycle.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_lifecycle.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_lifecycle.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_lifecycle.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float 
ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' 
[-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ 15 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx906. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, 
int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* 
tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ 15 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | 
^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ 15 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float 
ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' 
[-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ 15 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/utils.cc:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 1 warning generated when compiling for host. [ 52%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_parser.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral 
-fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_parser.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_parser.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_parser.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' 
[-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused 
function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ 15 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused 
function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' 
[-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ 15 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused 
function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' 
[-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 15 warnings generated when compiling for gfx906. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static 
long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' 
[-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ 15 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 
44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' 
[-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ 4 warnings generated when compiling for gfx1200. 15 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ 4 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ 15 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | 
^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/graph.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:517:10: warning: unused variable 'nBytes' [-Wunused-variable] 517 | size_t nBytes = count * ncclTypeSize(dataType); | ^~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:17: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused 
function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:19: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: 
warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:22: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:75:21: warning: unused function 'mscclXmlGetAttrInt' [-Wunused-function] 75 | static ncclResult_t mscclXmlGetAttrInt(struct mscclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:82:21: warning: unused function 'mscclXmlGetAttrInt64' [-Wunused-function] 82 | static ncclResult_t mscclXmlGetAttrInt64(struct mscclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_parser.h:89:21: warning: unused function 'mscclXmlFindTag' [-Wunused-function] 89 | static ncclResult_t mscclXmlFindTag(struct mscclXml* xml, const char* tagName, struct mscclXmlNode** node) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_lifecycle.cc:33:20: warning: unused variable 'mscclAlgoFilePathEnv' [-Wunused-variable] 33 | static const char* mscclAlgoFilePathEnv = "MSCCL_ALGO_FILE_PATH"; | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx908. 15 warnings generated when compiling for host. 
[ 53%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_setup.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_setup.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_setup.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_setup.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 4 warnings generated when compiling for gfx90a. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:16: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:712:16: warning: unused variable 'ret' [-Wunused-variable] 712 | ncclResult_t ret = ncclSuccess; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:724:16: warning: unused variable 'ret' [-Wunused-variable] 724 | ncclResult_t ret = ncclSuccess; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_parser.cc:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 4 warnings generated when compiling for host. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ [ 53%] Building CXX object CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_status.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host 
-fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_status.cc.o -MF CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_status.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_status.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1030. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1100. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1101. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1102. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1200. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1201. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ 1 warning generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:128:27: warning: unused variable 'threadLocalStatus' [-Wunused-variable] 128 | mscclThreadLocalStatus& threadLocalStatus = mscclGetThreadLocalStatus(); | ^~~~~~~~~~~~~~~~~ 1 warning generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_setup.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/channel.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 3 warnings generated when compiling for host. 
[ 53%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport/coll_net.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport/coll_net.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport/coll_net.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport/coll_net.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { 
NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { 
NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 22 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/misc/msccl/msccl_status.cc:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_status.h:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/msccl/msccl_struct.h:14: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 
'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, 
size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' 
[-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 22 warnings generated when compiling for gfx1100. 1 warning generated when compiling for host. 
[ 53%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport/generic.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport/generic.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport/generic.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport/generic.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { 
NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t 
collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 22 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable]
   77 |         uint32_t y, head, mantissa;
      |                  ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function]
   44 | static long log2i(long n) {
      |             ^~~~~
2 warnings generated when compiling for gfx1100.
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t 
collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 22 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: 
warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, 
void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1102. 22 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused 
function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 22 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: 
warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, 
void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 
'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx906. 22 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused 
function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 22 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t 
collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 22 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/generic.cc:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 54%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport/net_tmp.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport/net_tmp.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport/net_tmp.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport/net_tmp.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, 
size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, 
int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 22 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused 
function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:17:21: warning: unused function 'collNetDevices' [-Wunused-function] 17 | static ncclResult_t collNetDevices(struct ncclComm* comm, int* ndev) { NCCLCHECK(comm->ncclCollNet->devices(ndev)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:18:21: warning: unused function 'collNetGetProperties' [-Wunused-function] 18 | static ncclResult_t collNetGetProperties(struct ncclComm* comm, int dev, ncclNetProperties_t* props) { NCCLCHECK(comm->ncclCollNet->getProperties(dev, props)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:19:21: warning: unused function 'collNetListen' [-Wunused-function] 19 | static ncclResult_t collNetListen(struct ncclComm* comm, int dev, void* handle, void** listenComm) { NCCLCHECK(comm->ncclCollNet->listen(dev, handle, listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:20:21: warning: unused function 'collNetConnect' [-Wunused-function] 20 | static ncclResult_t collNetConnect(struct ncclComm* comm, void* handles[], int nranks, int rank, void* listenComm, void** collComm) { NCCLCHECK(comm->ncclCollNet->connect(handles, nranks, rank, listenComm, collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:22:21: warning: unused 
function 'collNetRegMr' [-Wunused-function] 22 | static ncclResult_t collNetRegMr(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMr(collComm, data, size, type, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:24:21: warning: unused function 'collNetRegMrDmaBuf' [-Wunused-function] 24 | static ncclResult_t collNetRegMrDmaBuf(struct ncclComm* comm, void* collComm, void* data, size_t size, int type, uint64_t offset, int fd, void** mhandle) { NCCLCHECK(comm->ncclCollNet->regMrDmaBuf(collComm, data, size, type, offset, fd, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:25:21: warning: unused function 'collNetDeregMr' [-Wunused-function] 25 | static ncclResult_t collNetDeregMr(struct ncclComm* comm, void* collComm, void* mhandle) { NCCLCHECK(comm->ncclCollNet->deregMr(collComm, mhandle)); return ncclSuccess; } | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:26:21: warning: unused function 'collNetIallreduce' [-Wunused-function] 26 | static ncclResult_t collNetIallreduce(struct ncclComm* comm, void* collComm, void* sendData, void* recvData, int count, ncclDataType_t dataType, ncclRedOp_t redOp, void* sendMhandle, void* recvMhandle, void** request) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:28:21: warning: unused function 'collNetIflush' [-Wunused-function] 28 | static ncclResult_t collNetIflush(struct ncclComm* comm, void* collComm, void* data, int size, void* mhandle, void** request) { NCCLCHECK(comm->ncclCollNet->iflush(collComm, data, size, mhandle, request)); return ncclSuccess; } | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:29:21: warning: unused function 'collNetTest' [-Wunused-function] 29 | static ncclResult_t collNetTest(struct ncclComm* comm, void* request, int* done, int* size) { NCCLCHECK(comm->ncclCollNet->test(request, done, size)); return ncclSuccess; } | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:30:21: warning: unused function 'collNetCloseColl' [-Wunused-function] 30 | static ncclResult_t collNetCloseColl(struct ncclComm* comm, void* collComm) { NCCLCHECK(comm->ncclCollNet->closeColl(collComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:31:21: warning: unused function 'collNetCloseListen' [-Wunused-function] 31 | static ncclResult_t collNetCloseListen(struct ncclComm* comm, void* listenComm) { NCCLCHECK(comm->ncclCollNet->closeListen(listenComm)); return ncclSuccess; } | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/coll_net.h:33:12: warning: unused function 'collNetSupport' [-Wunused-function] 33 | static int collNetSupport(struct ncclComm* comm) { return comm->ncclCollNet != nullptr ? 
1 : 0; } | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:203:21: warning: unused function 'collNetDumpMap' [-Wunused-function] 203 | static ncclResult_t collNetDumpMap(struct connectMap* map) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/coll_net.cc:406:21: warning: unused function 'sharedBuffersGet' [-Wunused-function] 406 | static ncclResult_t sharedBuffersGet(struct ncclCollNetSharedRes* collNet, int type, int slot, int channel, int* offset) { | ^~~~~~~~~~~~~~~~ 22 warnings generated when compiling for host. 
[ 54%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport/net_ib.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport/net_ib.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport/net_ib.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport/net_ib.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | 
^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' 
[-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' 
[-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 24 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: 
unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' 
[-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' 
[-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ 24 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, 
int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 
'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t 
xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | 
static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 24 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long 
log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' 
[-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 24 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | 
^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static nIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct 
ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 
'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21cclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' 
[-Wunused-function] 262 | static float ncc: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ lTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 24 warnings generated when compiling for gfx1200. 17 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* 
tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: 
warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 24 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused 
function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 17 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t 
xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* 
node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused 
function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 24 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of 
function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int 
cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | 
^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 
'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' 
[-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 24 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: 
warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int 
value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | 
static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 24 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long 
log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' [-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' 
[-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' [-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ 24 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/net.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_ib.cc:30: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:76:21: warning: unused function 'xmlAlloc' [-Wunused-function] 76 | static ncclResult_t xmlAlloc(struct ncclXml** xml, int maxNodes) { | ^~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:111:21: warning: unused function 'xmlGetAttrInt' [-Wunused-function] 111 | static ncclResult_t xmlGetAttrInt(struct ncclXmlNode* node, const char* attrName, int* value) { | ^~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:118:21: warning: unused function 'xmlGetAttrIntDefault' [-Wunused-function] 118 | static ncclResult_t xmlGetAttrIntDefault(struct ncclXmlNode* node, const char* attrName, int* value, int defaultValue) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:125:21: warning: unused function 'xmlGetAttrLong' [-Wunused-function] 125 | static ncclResult_t xmlGetAttrLong(struct ncclXmlNode* node, const char* attrName, int64_t* value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:133:21: warning: unused function 'xmlGetAttrFloat' [-Wunused-function] 133 | static ncclResult_t xmlGetAttrFloat(struct ncclXmlNode* node, const char* attrName, float* value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:140:21: warning: unused function 'xmlFindTag' [-Wunused-function] 140 | static ncclResult_t xmlFindTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:152:21: warning: unused function 'xmlFindNextTag' [-Wunused-function] 152 | static ncclResult_t xmlFindNextTag(struct ncclXml* xml, const char* tagName, struct ncclXmlNode* prev, struct ncclXmlNode** node) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:164:21: warning: unused function 'xmlFindTagKv' [-Wunused-function] 164 | static ncclResult_t xmlFindTagKv(struct ncclXml* xml, const char* tagName, struct ncclXmlNode** node, const char* attrName, const char* attrValue) { | ^~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:180:21: warning: unused function 'xmlFindNode' 
[-Wunused-function] 180 | static ncclResult_t xmlFindNode(struct ncclXmlNode* parentNode, struct ncclXmlNode* searchNode, struct ncclXmlNode** node) { | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:203:21: warning: unused function 'xmlSetAttr' [-Wunused-function] 203 | static ncclResult_t xmlSetAttr(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:216:21: warning: unused function 'xmlSetAttrIfUnset' [-Wunused-function] 216 | static ncclResult_t xmlSetAttrIfUnset(struct ncclXmlNode* node, const char* attrName, const char* value) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:228:21: warning: unused function 'xmlSetAttrInt' [-Wunused-function] 228 | static ncclResult_t xmlSetAttrInt(struct ncclXmlNode* node, const char* attrName, const int value) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:241:21: warning: unused function 'xmlSetAttrFloat' [-Wunused-function] 241 | static ncclResult_t xmlSetAttrFloat(struct ncclXmlNode* node, const char* attrName, const float value) { | ^~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:254:21: warning: unused function 'xmlSetAttrLong' [-Wunused-function] 254 | static ncclResult_t xmlSetAttrLong(struct ncclXmlNode* node, const char* attrName, const int64_t value) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:267:21: warning: unused function 'xmlUnsetAttr' [-Wunused-function] 267 | static ncclResult_t xmlUnsetAttr(struct ncclXmlNode* node, const char* attrName) { | ^~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:279:21: warning: unused function 'xmlGetSub' [-Wunused-function] 279 | static ncclResult_t xmlGetSub(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:305:21: warning: unused function 'xmlGetSubKvInt' [-Wunused-function] 305 | static ncclResult_t xmlGetSubKvInt(struct ncclXmlNode* node, const char* subName, struct ncclXmlNode** sub, const char* attrName, const int attrValue) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:312:21: warning: unused function 'xmlAddNode' [-Wunused-function] 312 | static ncclResult_t xmlAddNode(struct ncclXml* xml, struct ncclXmlNode* parent, const char* subName, struct ncclXmlNode** sub) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:334:21: warning: unused function 'xmlRemoveNode' [-Wunused-function] 334 | static ncclResult_t xmlRemoveNode(struct ncclXmlNode* node) { | ^~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:347:21: warning: 'static' function 'xmlAddTree' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration] 347 | static ncclResult_t xmlAddTree(struct ncclXml* dst, struct ncclXmlNode* parent, struct ncclXmlNode* srcNode) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:377:21: warning: unused function 'kvConvertToInt' [-Wunused-function] 377 | static ncclResult_t kvConvertToInt(const char* str, int* value, struct kvDict* dict) { | ^~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/xml.h:390:21: warning: unused function 'kvConvertToStr' 
[-Wunused-function] 390 | static ncclResult_t kvConvertToStr(int value, const char** str, struct kvDict* dict) { | ^~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 24 warnings generated when compiling for host. [ 54%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport/net_socket.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config 
/usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport/net_socket.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport/net_socket.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport/net_socket.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused 
function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 17 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:183:14: warning: unused variable 'info' [-Wunused-variable] 183 | gdr_info_t info; | ^~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:185:12: warning: unused variable 'mh' [-Wunused-variable] 185 | gdr_mh_t mh; | ^~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:187:9: warning: unused variable 'gdrMap' [-Wunused-variable] 187 | void *gdrMap; | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:219:19: warning: unused variable 'md' [-Wunused-variable] 219 | gdr_mem_desc_t *md = (gdr_mem_desc_t*)gdrHandle; | ^~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:356:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 356 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:15: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/gdrwrap.h:163:14: warning: unused function 'ncclGdrInit' [-Wunused-function] 163 | static gdr_t ncclGdrInit() { | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:21: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t 
ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_tmp.cc:285:21: warning: unused function 'netDumpMap' [-Wunused-function] 285 | static ncclResult_t netDumpMap(struct connectMap* map) { | ^~~~~~~~~~ 2 warnings generated when compiling for gfx1100. 17 warnings generated when compiling for host. 
[ 54%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport/nvls.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport/nvls.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport/nvls.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport/nvls.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/net_socket.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ [ 54%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport/p2p.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection 
-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport/p2p.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport/p2p.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport/p2p.cc.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused 
function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 11 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused 
function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 11 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/nvls.cc:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. 
[ 55%] Building CXX object CMakeFiles/rccl.dir/hipify/src/transport/shm.cc.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/src/transport/shm.cc.o -MF CMakeFiles/rccl.dir/hipify/src/transport/shm.cc.o.d -o CMakeFiles/rccl.dir/hipify/src/transport/shm.cc.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 11 warnings generated when compiling for gfx1101. 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | 
static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 2 warnings generated when compiling for gfx1100. 11 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | 
static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 2 warnings generated when compiling for gfx1101. 11 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | 
static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 2 warnings generated when compiling for gfx1102. 11 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | 
static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 2 warnings generated when compiling for gfx1200. 11 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 
'ncclTopoRankToIndex' [-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 11 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' 
[-Wunused-function] 226 | static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | 
static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 2 warnings generated when compiling for gfx908. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:325:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 325 | hipGetLastError(); | ^~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/p2p.cc:13: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:215:21: warning: unused function 'ncclTopoIdToIndex' [-Wunused-function] 215 | static ncclResult_t ncclTopoIdToIndex(struct ncclTopoSystem* system, int type, int64_t id, int* index) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:226:21: warning: unused function 'ncclTopoRankToIndex' [-Wunused-function] 226 | 
static ncclResult_t ncclTopoRankToIndex(struct ncclTopoSystem* system, int rank, int* index) { | ^~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:237:21: warning: unused function 'ncclTopoDevToRank' [-Wunused-function] 237 | static ncclResult_t ncclTopoDevToRank(struct ncclTopoSystem* system, int dev, int* rank) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:249:21: warning: unused function 'ncclTopoIdToNetDev' [-Wunused-function] 249 | static ncclResult_t ncclTopoIdToNetDev(struct ncclTopoSystem* system, int64_t id, int* netDev) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:262:14: warning: unused function 'ncclTopoXGMISpeed' [-Wunused-function] 262 | static float ncclTopoXGMISpeed(const char* gcn) { | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:272:14: warning: unused function 'ncclTopoNVLinkBw' [-Wunused-function] 272 | static float ncclTopoNVLinkBw(int cudaCompCap) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:283:13: warning: unused function 'isPow2' [-Wunused-function] 283 | static bool isPow2(int val) { | ^~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/graph/topo.h:286:12: warning: unused function 'mirrorBits' [-Wunused-function] 286 | static int mirrorBits(int val, int pow2) { | ^~~~~~~~~~ 2 warnings generated when compiling for gfx90a. 11 warnings generated when compiling for host. 
[ 55%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_gather_sum_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_gather_sum_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_gather_sum_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_gather_sum_i8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 
'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:14: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/transport/shm.cc:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/comm.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/p2p.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/core.h:38: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/alloc.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/utils.h:44:13: warning: unused function 'log2i' [-Wunused-function] 44 | static long log2i(long n) { | ^~~~~ 2 warnings generated when compiling for host. [ 55%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 
--offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; 
\ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused 
variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 11 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 
145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1101. 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 
'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, 
FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, 
FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. 11 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, 
FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:171:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 171 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:5:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllGather_RING_LL128_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 12 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 
145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:171:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 171 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllGather_RING_LL128_Sum_i8_2, ncclFuncAllGather, FuncSum, 
int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. 12 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:20:15: warning: unused variable 'bid' [-Wunused-variable] 20 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_2, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:58:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 58 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_gather.h:157:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 157 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_gather_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(AllGather_RING_SIMPLE_Sum_i8_4, ncclFuncAllGather, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for host. [ 55%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 
-mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, 
FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, 
FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. 17 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ 
~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, 
FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, 
FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, 
FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | 
DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, 
FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, 
FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, 
FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncAllReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 56%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security 
-Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused 
variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 
'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, 
ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, 
ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, 
FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, 
ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, 
ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | 
DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, 
FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of 
function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE 
flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:5:1: 
note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 
1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | 
DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, 
FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of 
function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE 
flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:5:1: 
note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 
1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncAllReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 56%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 
'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:12:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_2, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f16_4, ncclFuncAllReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 56%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 
145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, 
ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, 
float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded 
from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, 
ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, 
float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded 
from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_2, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f32_4, ncclFuncAllReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 56%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_2, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f64_4, ncclFuncAllReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 57%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, 
ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { 
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, 
uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, 
rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_2, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u32_4, ncclFuncAllReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 57%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 
145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, 
rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, 
rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, 
rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. 17 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: 
warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized 
after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested 
here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEF/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 
432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ INE_ncclDevFunc(AllReduce_RING_LL128_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllRed/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.hu:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will 
be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ ce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ , RedOp, Algo, Proto, COLL_UNROLL>().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | 
DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, 
rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, 
ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { 
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE 
flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, 
ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { 
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_2, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u64_4, ncclFuncAllReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. 17 warnings generated when compiling for host. [ 57%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall 
-Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused 
variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 
145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_2, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_f8_4, ncclFuncAllReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ [ 57%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 
--offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | 
DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, 
uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, 
ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded 
from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 
5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, 
ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_2, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, 
uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_minmax_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_MinMax_u8_4, ncclFuncAllReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 58%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused 
variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, 
ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncAllReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 58%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, 
ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, 
FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, 
FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:5:1: note: in instantiation of member function 
'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in 
instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, 
ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), 
wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 
611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, 
ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, 
ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncAllReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 58%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, 
FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, 
ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, 
ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, 
ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. 17 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 
| DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 
498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, 
ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 
| DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, 
ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncAllReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 58%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, 
FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 58%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' 
requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, 
ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, 
FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, 
ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, 
ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. 17 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, 
FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 
| DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, 
ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncAllReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 59%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, 
FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 
| DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, 
ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncAllReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 59%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, 
FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 
| DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, 
ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, 
ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. 17 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' 
[-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be 
initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested 
here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, 
FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, 
ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncAllReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 59%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, 
uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ TreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation 
of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSizeIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ _) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested hereIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 565 | runTree/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ UpDownchannelLo; | ^~~ ProtoSimple<1, 1, COLL_UNROLL>, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 
'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, 
FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 
| flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 
2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of 
member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), 
warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, 
ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 
| const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, 
FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. 17 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncAllReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ [ 59%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS 
-I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, 
uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, 
FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 
| const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_premulsum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncAllReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 60%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: 
warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | 
const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_2, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf16_4, ncclFuncAllReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 60%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp 17 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: 
unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 
'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function 
template specialization '(anonymous namespace)::runTreeSplit<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:5:1: note: in instantiation of 
member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in 
instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: 
note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 
| tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member 
function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, 
ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded 
from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 
611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_2, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f16_4, ncclFuncAllReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 60%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 
| uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_2, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' 
will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, 
rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_bf8_4, ncclFuncAllReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 60%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp 17 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ LE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ 
== 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ es[NCCL_PROTOIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | 
^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ _SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0>In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 
432 | if (tid < subtn) RuIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ nWorkCollchannelLo; | ^~~ ROLL>().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 
4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in 
instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' 
[-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: 
field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 21 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_2, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f32_4, ncclFuncAllReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 61%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_2, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f64_4, ncclFuncAllReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 61%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp 17 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: 
unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' 
requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ?
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' 
[-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' 
[-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 
1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSizIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ e_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ COLL_UNROLL>().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, 
rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 
4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_2, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u32_4, ncclFuncAllReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 61%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 
'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded 
from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f8_2, ncclFuncAllReduce, FuncProd, 
rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous 
namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_2, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_f8_4, ncclFuncAllReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 61%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_2, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u64_4, ncclFuncAllReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 61%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize'
670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group),
| ^~~~~~~~~~~
17 warnings generated when compiling for gfx1030.
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable]
77 | uint32_t y, head, mantissa;
| ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable]
75 | barrier_by_group();
| ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group'
29 | const int w = threadIdx.x/WARP_SIZE; \
| ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable]
75 | barrier_by_group();
| ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group'
29 | const int w = threadIdx.x/WARP_SIZE; \
| ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable]
145 | uint32_t data1, flag1, data2, flag2;
| ^~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable]
145 | uint32_t data1, flag1, data2, flag2;
| ^~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable]
145 | uint32_t data1, flag1, data2, flag2;
| ^~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable]
145 | uint32_t data1, flag1, data2, flag2;
| ^~~~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable]
80 | barrier_by_group();
| ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group'
29 | const int w = threadIdx.x/WARP_SIZE; \
| ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable]
27 | const int bid = ncclShmem.channelId - work->channelLo;
| ^~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable]
218 | const int bid = ncclShmem.channelId - work->channelLo;
| ^~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable]
366 | const int bid = ncclShmem.channelId - work->channelLo;
| ^~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15:
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_2, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_prod_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Prod_u8_4, ncclFuncAllReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 62%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, 
hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded 
from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_2, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf16_4, ncclFuncAllReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 62%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, 
ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:5:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), 
group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous 
namespace)::runTreeSplit<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, 
ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoLL128, 
2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. 17 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_2, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f16.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f16_4, ncclFuncAllReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 62%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | 
DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' 
requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded 
from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_2, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_bf8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_bf8_4, ncclFuncAllReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for host. [ 62%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config 
/usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 
2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | In file included from P/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ rimitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, worIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145k); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1::21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15:MPLE, 2) | ^ note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads)In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ , tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthread/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ s(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be 
initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' 
will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ ce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused 
variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 
'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_2, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f32_4, ncclFuncAllReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 63%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_2, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f64_4, ncclFuncAllReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 63%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 
2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlockIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ (threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ CCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatchchannelLo; | ^~~ ty, redop, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSizeIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ _ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ _SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of 
member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_2, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u32_4, ncclFuncAllReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 63%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 
'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous 
namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested 
here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), 
group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_2, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_f8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_f8_4, ncclFuncAllReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 63%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 
'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); 17 warnings generated when compiling for gfx906. 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 
12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested 
here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize'
670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group),
| ^~~~~~~~~~~
17 warnings generated when compiling for gfx1101.
21 warnings generated when compiling for gfx90a.
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable]
77 | uint32_t y, head, mantissa;
| ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable]
75 | barrier_by_group();
| ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group'
29 | const int w = threadIdx.x/WARP_SIZE; \
| ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable]
75 | barrier_by_group();
| ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group'
29 | const int w = threadIdx.x/WARP_SIZE; \
| ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable]
145 | uint32_t data1, flag1, data2, flag2;
| ^~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable]
145 | uint32_t data1, flag1, data2, flag2;
| ^~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable]
145 | uint32_t data1, flag1, data2, flag2;
| ^~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable]
145 | uint32_t data1, flag1, data2, flag2;
| ^~~~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable]
80 | barrier_by_group();
| ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group'
29 | const int w = threadIdx.x/WARP_SIZE; \
| ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable]
27 | const int bid = ncclShmem.channelId - work->channelLo;
| ^~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable]
218 | const int bid = ncclShmem.channelId - work->channelLo;
| ^~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable]
366 | const int bid = ncclShmem.channelId - work->channelLo;
| ^~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2:
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11:
In file included from
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: 
warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 
2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const 
int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_2, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 
22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u64_4, ncclFuncAllReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 64%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous 
namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested 
here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 
'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | 
DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), 
group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' 
requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, 
flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int 
bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_2, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sum_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 
| DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_Sum_u8_4, ncclFuncAllReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 64%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, 
ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, 
ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 
5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 
498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, 
ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, 
ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncAllReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 64%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security 
-Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: 
in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order 
[-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: 
in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, 
ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, 
ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, 
ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, 
ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, 
ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, 
ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 17 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, 
ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous 
namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncAllReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 17 warnings generated when compiling for host. [ 64%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg 
-Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, 
ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. 17 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' 
[-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ : note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: 
note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NC/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | 
DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ CL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthre/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), groupads(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ InBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, 
ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, 
ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: 
unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested 
here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncAllReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 64%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, 
FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 
5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 
498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 
2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template 
specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncAllReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 65%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp 17 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 
218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1101. 17 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run()In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ ; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, 
NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, 
nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx90a. 17 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' 
[-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will 
be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' 
requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLLt_idInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads):1, tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, 
uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ : note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 
670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 21 warnings generated when compiling for gfx942. 17 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncAllReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: 
field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 65%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/alltoall_pivot_sum_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/alltoall_pivot_sum_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/alltoall_pivot_sum_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/alltoall_pivot_sum_i8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, 
FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_2, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:37:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 37 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/alltoall_pivot.h:82:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 82 | runRing(tid, nThreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/alltoall_pivot_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(AllToAllPivot_RING_SIMPLE_Sum_i8_4, ncclFuncAllToAllPivot, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 65%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/broadcast_sum_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/broadcast_sum_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/broadcast_sum_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/broadcast_sum_i8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1030. 21 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group'In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 
145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ barrier_by_group()/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ ; | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15:/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:421:9: note: in instantiation of member function 'Primitives, FanSymmetric<2>, 0, ProtoLL128, 0>::Primitives' requested here 421 | prims(tid, nthreads, tree->down, tree->down, work->sendbuff, work->recvbuff, work->redOpArg); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous 
namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:461:9: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoLL128, 0>::Primitives' requested here 461 | prims(tid, nthreadsSplit, tree->down, &tree->up, work->sendbuff, work->recvbuff, work->redOpArg, 0*Proto::MaxGroupWidth); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:503:9: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoLL128, 0>::Primitives' requested here 503 | prims(tid-nthreadsSplit, nthreads-nthreadsSplit, &tree->up, tree->down, work->sendbuff, work->recvbuff, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1070:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeSplit, ProtoLL128, 2>' requested here 1070 | runTreeSplit(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 0, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(AllReduce_TREE_LL128_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: 
in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE 
flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:1062:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 1062 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:10:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 10 | DEFINE_ncclDevFunc(AllReduce_RING_LL128_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member 
function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx1102. 21 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ ; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t daIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ ta1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:27:15: warning: unused variable 'bid' [-Wunused-variable] 27 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:218:15: warning: unused variable 'bid' [-Wunused-variable] 218 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:366:15: warning: unused variable 'bid' [-Wunused-variable] 366 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | 
DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ oto, COLL_UNROLL>().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 2>, 2>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 12 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 
'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:254:90: note: in instantiation of member function 'Primitives, FanAsymmetric<2, 1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 254 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, 
ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:303:90: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 2>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 303 | Primitives, /*Direct=*/0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:565:5: note: in instantiation of function template specialization '(anonymous namespace)::runTreeUpDown, ProtoSimple<1, 1, 4>, 4>' requested here 565 | runTreeUpDown, COLL_UNROLL>(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 0, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:17:1: note: in instantiation of member function 'RunWorkBatch, 0, 2, 4>::run' requested here 17 | DEFINE_ncclDevFunc(AllReduce_TREE_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_TREE, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:63:56: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 63 | Primitives, 0, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/all_reduce.h:558:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 558 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp:22:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested 
here 22 | DEFINE_ncclDevFunc(AllReduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncAllReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 17 warnings generated when compiling for host. [ 65%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/device_table.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 
-Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/device_table.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/device_table.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/device_table.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1030. 11 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1102. 
11 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 11 warnings generated when compiling for gfx906. 
1 warning generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 1 warning generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx90a. 11 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/device_table.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ 1 warning generated when compiling for host. 
[ 66%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/host_table.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/host_table.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/host_table.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/host_table.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:111:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 111 | runRing(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Broadcast_RING_LL128_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1100. 
12 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:111:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 
2>' requested here 111 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Broadcast_RING_LL128_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1200. 
12 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:19:15: warning: unused variable 'bid' [-Wunused-variable] 19 | const int bid = ncclShmem.channelId - work->channelLo; | ^~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_2, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:60:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 60 | prims(tid, nthreads, &ring->prev, &ring->next, inputBuf, outputBuf, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/broadcast.h:97:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 97 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/broadcast_sum_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Broadcast_RING_SIMPLE_Sum_i8_4, ncclFuncBroadcast, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for host. [ 66%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 
'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/host_table.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: 
warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for host. [ 66%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp.o -MF 
CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 66%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested 
here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid10 warnings generated when compiling for gfx90a. ), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 67%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused 
variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 
'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ?
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ [ 67%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack 
-I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoLL128, false>' requested here 3 | 
MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 67%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ 
== 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after 
field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: 
note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, 
false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | 
mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | 
mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused 
variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ 
== 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, 
rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 67%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, 
Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ [ 67%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG 
-DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 68%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ 
== 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 68%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, 
Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after 
field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | 
mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | 
mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 68%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax,
uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ?
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after 
field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for host. [ 68%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection 
-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:1410 warnings generated when compiling for gfx1101. : /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after 
field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | 
mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | 
mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after 
field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: 
in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ [ 69%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 
-Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, 
Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_MinMax_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(MinMax, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for host. [ 69%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 
--offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: 
warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, 
ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 69%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.hIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in 
instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ :174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 
199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 69%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: 
field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), 
warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order 
[-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 
'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' 
[-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, 
work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ 
^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | 
MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 70%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 
'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, 
ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, 
fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | 
MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 70%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 
0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 
498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | 
warpInBlock(threadIdx.x/WARP_SIZE)In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(th, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ readIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | 
mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group'In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration 
order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | 
mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | 
mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 
'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 
498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, 
false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 70%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused 
variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 70%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 
0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' 
[-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~ 10 warnings generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ [ 70%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i8.cpp.o In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe 
-Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, 
ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, 
fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ 
| ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 71%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 
'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 
'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 
498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 71%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 
? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' 
[-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in 
instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), 
warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order 
[-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 71%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 71%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Prod_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Prod, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 72%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 
0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' 
[-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, 
false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: 
note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, hip_bfloat16, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 72%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 
| Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: 
expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_bf8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_bfloat8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 72%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 
| Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order 
[-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f16.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, half, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings 
generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 
'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' [ 72%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f64.cpp.o 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f64.cpp.o -MF 
CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: 
note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942.
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, float, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 73%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 
0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), 
warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order 
[-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, double, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 73%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) 
{ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_f8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, rccl_float8, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 73%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 
'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) 
{ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, 
ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, 
fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | 
MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, 
ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, 
fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 73%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC 
-parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 
'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 
| flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | 
^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: 
unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: 
field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 74%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 
0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from 
macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 10 warnings generated when compiling for gfx1101. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_i8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, int8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for host. [ 74%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 
--offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: 
note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u32.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint32_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 74%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: 
unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded 
from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u64.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint64_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 74%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | 
flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not 
match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 
'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 
'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoLL128, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoLL128, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:384:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 384 | mscclRunInterpreter, ProtoLL128, fullOps>(comm, algo, work); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:13: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:199:57: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 1>, 1, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 199 | Primitives, 1, Proto, 0> prims | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/msccl_kernel_Sum_u8.cpp:3:1: note: in instantiation of function template specialization 'mscclRunInterpreter, ProtoSimple<2, 2, 2>, false>' requested here 3 | MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE(Sum, uint8_t, false); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/msccl_kernel_impl.h:387:3: note: expanded from macro 'MSCCL_IMPL_KERNEL_ENTRY_FUNC_DEVREDOP_TYPE' 387 | mscclRunInterpreter, ProtoSimple, fullOps>(comm, algo, work); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be 
initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 74%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 
| const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not 
match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduce, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 75%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:In file included from 
174/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ : /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | 
DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduce, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 75%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 
'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 
670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_2, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f16_4, ncclFuncReduce, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 75%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 
'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 
670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 
670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), 
group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in 
instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_2, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f32_4, ncclFuncReduce, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 75%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in 
instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_2, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f64_4, ncclFuncReduce, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 10 warnings generated when compiling for gfx1030. 
[ 76%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not 
match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not 
match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_2, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u32_4, ncclFuncReduce, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 76%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 
670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_2, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_f8_4, ncclFuncReduce, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 76%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not 
match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ 
| tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized 
after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 
75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: 
unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->preIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, 
work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ v, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ Proto, COLL_UNROLL>().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, 
ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_2, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u64_4, ncclFuncReduce, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 76%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does 
not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, 
algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 
1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 
432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_2, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_minmax_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_MinMax_u8_4, ncclFuncReduce, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 77%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order 
does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | 
RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, 
hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduce, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 77%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 
| const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 
| if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 
2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be 
initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 
432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be 
initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 
432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduce, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 77%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, 
rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, 
rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduce, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ [ 77%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack 
-I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, 
subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, 
NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduce, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ [ 77%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS 
-I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | 
if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not 
match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, 
proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduce, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 78%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, 
NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduce, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 78%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 
| const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 
| if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | 
prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduce, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 11 warnings generated when compiling for gfx90a. [ 78%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 
--offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduce, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 78%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ 
~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' 
will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 
| const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will 
be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | 
DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.hIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ :11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization 
'(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, 
NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | 
prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 
? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduce, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 79%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match 
the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_2, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf16_4, ncclFuncReduce, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 79%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match 
the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 
| RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = tIn file included from 
h/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ readIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEP/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ S/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: 
field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:1077 warnings generated when compiling for gfx1201. 
:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded 
from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_2, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f16_4, ncclFuncReduce, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 79%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, manIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ tissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized 
after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 
1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused 
variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match 
the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_2, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_bf8_4, ncclFuncReduce, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 79%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after 
field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | 
const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 
| barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused 
variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, 
FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of 
member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match 
the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_2, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | 
^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f32_4, ncclFuncReduce, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 80%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration 
order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, 
NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_2, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f64_4, ncclFuncReduce, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 80%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int 
w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | 
if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match 
the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | 
^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 
| tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, 
NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_2, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u32_4, ncclFuncReduce, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 80%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_2, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_f8_4, ncclFuncReduce, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 80%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int 
w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match 
the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized 
after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 
1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, 
FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | 
^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested 
here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match 
the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_2, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); 
\ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u64_4, ncclFuncReduce, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 80%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, 
NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 
432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 
145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_2, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_prod_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Prod_u8_4, ncclFuncReduce, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 81%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_2, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf16_4, ncclFuncReduceScatter, FuncMinMax, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 81%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' 
will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 
'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' 
requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ 
~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_2, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_bf8_4, ncclFuncReduceScatter, FuncMinMax, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 81%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:5:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 
'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_2, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncMinMax<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncMinMax<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f16_4, ncclFuncReduceScatter, FuncMinMax, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 81%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 
1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' 
requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: 
field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | 
if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_2, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f32_4, ncclFuncReduceScatter, FuncMinMax, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 82%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | 
^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member 
function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: 
note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 
| tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after 
field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, 
subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: 
field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | 
if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | 
^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_2, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f64_4, ncclFuncReduceScatter, FuncMinMax, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 82%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: 
warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 
| tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of 
member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member 
function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 
1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in 
instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_2, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u32_4, ncclFuncReduceScatter, FuncMinMax, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 82%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 
1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_2, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_f8_4, ncclFuncReduceScatter, FuncMinMax, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 82%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member 
function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 
'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ 
^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ 
^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_2, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u64_4, ncclFuncReduceScatter, FuncMinMax, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 83%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp.o.d -o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_2, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_minmax_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_MinMax_u8_4, ncclFuncReduceScatter, FuncMinMax, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 83%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ 
^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_2, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf16_4, ncclFuncReduceScatter, FuncPreMulSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 83%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | 
prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ 
~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' 
requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_2, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncPreMulSum, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncPreMulSum, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f16_4, ncclFuncReduceScatter, FuncPreMulSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 83%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f32.cpp.o 
-c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_bf8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15
: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ [ 83%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 
--offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' 
will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation 
of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_2, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f32_4, ncclFuncReduceScatter, FuncPreMulSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 84%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, 
work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_2, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f64_4, ncclFuncReduceScatter, FuncPreMulSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 84%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u32.cpp.o 
-c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation 
of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_2, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u32_4, ncclFuncReduceScatter, FuncPreMulSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ [ 84%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u64.cpp.o -MF 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 
0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
| group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_2, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u64_4, ncclFuncReduceScatter, FuncPreMulSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 84%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_2, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_f8_4, ncclFuncReduceScatter, FuncPreMulSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 85%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 10 warnings generated when compiling for gfx1100. 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:5:1: note: in instantiation of member function 
'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized 
after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' 
will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized 
after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_2, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_premulsum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_PreMulSum_u8_4, ncclFuncReduceScatter, FuncPreMulSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 85%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: 
unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: 
field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | 
if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_2, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf16_4, ncclFuncReduceScatter, FuncProd, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 85%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused 
variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from 10 warnings generated when compiling for gfx1101. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested 
here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be 
initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 
4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | 10 warnings generated when compiling for gfx1200. tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), 
group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: 
field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_2, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncProd<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncProd<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f16_4, ncclFuncReduceScatter, FuncProd, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 85%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ : In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be 
initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_2, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_bf8_4, ncclFuncReduceScatter, FuncProd, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 86%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 
'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ 
| warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdxIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const in.x/WARP_SIZE; t w = threadIdx.x/WARP_SIZE; \ | ^ \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg,In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE;In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 
'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_2, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f32_4, ncclFuncReduceScatter, FuncProd, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 86%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_2, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f64_4, ncclFuncReduceScatter, FuncProd, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 86%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | 
^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: 
unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: 
expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ IMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWoIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ rkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), groupIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ (group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | 
warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function 
template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' 
requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_2, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u32_4, ncclFuncReduceScatter, FuncProd, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 86%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: 
unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_2, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_f8_4, ncclFuncReduceScatter, FuncProd, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 87%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 
-Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member 
function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after 
field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, 
subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) 
{ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { 
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_2, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u64_4, ncclFuncReduceScatter, FuncProd, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 87%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in 
instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_2, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_prod_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Prod_u8_4, ncclFuncReduceScatter, FuncProd, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 87%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: 
field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | 
if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused 
variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), 
| ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, 
NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of 
member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) 
{ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_2, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf16_4, ncclFuncReduceScatter, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 87%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, 
&ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ 
== 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 
| tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_2, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), RunWtidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' 
requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ orkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, 
algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_bf8_4, ncclFuncReduceScatter, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 87%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, 
NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_2, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f16_4, ncclFuncReduceScatter, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 88%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 
--offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tidIn file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ , nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in 
instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 
'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 
'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused 
variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_2, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f32_4, ncclFuncReduceScatter, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x),10 warnings generated when compiling for host. group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ [ 88%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 
--offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested 
here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { 
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_2, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f64_4, ncclFuncReduceScatter, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 88%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: 
warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, 
&ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 
? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_f8_2, ncclFuncReduceScatter, 
FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, 
&ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 
? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | 
tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' 
requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_2, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u32_4, ncclFuncReduceScatter, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 88%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_2, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_f8_4, ncclFuncReduceScatter, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 89%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 
'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 
'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 
'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: 
unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_u8_2, ncclFuncReduceScatter, 
FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:7:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_2, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u64_4, ncclFuncReduceScatter, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 89%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads 
stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_2, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from 
macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_Sum_u8_4, ncclFuncReduceScatter, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for host. 
[ 89%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp.o.d -o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduceScatter, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 89%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | 
^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduceScatter, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 90%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ 
| ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: 
warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: 
note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: 
warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | 
^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduceScatter, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 90%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp.o.d -o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: 
warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduceScatter, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 90%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp.o 
-c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation 
of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173:
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) 
| ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | 
DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation 
of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 
'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: 
unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) 
RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(groupIn file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ ), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded 
from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation 
of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduceScatter, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 90%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf16.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 34 | prims(tid, nthreads, 
&ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:79:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 79 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(ReduceScatter_RING_LL128_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 
'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if 
(tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 2>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 2>, 2>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:34:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<2, 2, 4>, 0>::Primitives' requested here 34 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce_scatter.h:65:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<2, 2, 4>, 4>' requested here 65 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(ReduceScatter_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduceScatter, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 90%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | 
^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, 
FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int 
w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match 
the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, 
unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized 
after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 
1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_2, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf16.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf16_4, ncclFuncReduce, FuncSum, hip_bfloat16, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 91%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f16.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f16.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f16.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f16.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 
| tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, 
NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | 
RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x 
nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be 
initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested 
here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_2, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_bf8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_bf8_4, ncclFuncReduce, FuncSum, rccl_bfloat8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 91%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | 
RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, 
FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from 
macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized 
after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_2, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives<__half, FuncSum<__half>, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing<__half, FuncSum<__half>, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f16.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f16_4, ncclFuncReduce, FuncSum, half, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 91%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int 
w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int 
w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int 
w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of 
function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ : In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused 
variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused 
variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_2, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f32_4, ncclFuncReduce, FuncSum, float, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ [ 91%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 
--offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, 
work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), 
tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration 
order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_2, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f64_4, ncclFuncReduce, FuncSum, double, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 92%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, 
FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused 
variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration 
order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' 
[-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | 
stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | 
~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | 
^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_2, ncclFuncReduce, 
FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u32_4, ncclFuncReduce, FuncSum, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 92%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u64.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused 
variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_2, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_f8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_f8_4, ncclFuncReduce, FuncSum, rccl_float8, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 92%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u8.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the 
declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int 
w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < 
subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const 
int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused 
variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | 
barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused 
variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | 
tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 
'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | 
~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, 
FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, 
flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, 
NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_2, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u64.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u64_4, ncclFuncReduce, FuncSum, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 92%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i32.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, 
NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_2, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sum_u8.cpp:12:1: 
note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_Sum_u8_4, ncclFuncReduce, FuncSum, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 93%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 
432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 
432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_2, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i32_4, ncclFuncReduce, FuncSumPostDiv, int32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 93%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does 
not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, 
algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_2, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i64_4, ncclFuncReduce, FuncSumPostDiv, int64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 93%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u32.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u32.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u32.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u32.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, 
ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 
| const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 
| if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1102. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 
29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 
432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 10 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, 
NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 11 warnings generated when compiling for gfx90a. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' 
requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_2, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_i8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_i8_4, ncclFuncReduce, FuncSumPostDiv, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 93%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u64.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u64.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u64.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u64.cpp.o -c 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_2, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u32.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u32_4, ncclFuncReduce, FuncSumPostDiv, uint32_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 93%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = 
threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In 
file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1100. 10 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1101. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 
145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1200. 10 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | 
uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 
| const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 
| if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 10 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of 
member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, 
uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx906. 11 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, 
uint64_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' 
[-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_2, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u64.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u64_4, ncclFuncReduce, FuncSumPostDiv, uint64_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. [ 94%] Building CXX object CMakeFiles/rccl.dir/hipify/gensrc/sendrecv_sum_i8.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 
--offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/hipify/gensrc/sendrecv_sum_i8.cpp.o -MF CMakeFiles/rccl.dir/hipify/gensrc/sendrecv_sum_i8.cpp.o.d -o CMakeFiles/rccl.dir/hipify/gensrc/sendrecv_sum_i8.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t 
data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ 11 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:498:29: warning: field 'group' will be initialized after field 'stepSize' [-Wreorder-ctor] 496 | tid(tid), nthreads(nthreads), wid(tid%WARP_SIZE), warp(tid/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~ | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t) 497 | warpInBlock(threadIdx.x/WARP_SIZE), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | warp(tid/WARP_SIZE 498 | flagThread((tid%4)==3), group(group), | ~~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~ | warpInBlock(threadIdx.x/WARP_SIZE flagThread((tid%4)==3 499 | stepSize(ncclShmem.comm.buffSizes[NCCL_PROTO_LL128]/NCCL_STEPS/sizeof(uint64_t)) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoLL128, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, 
work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:77:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoLL128, 2>' requested here 77 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 1, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:5:1: note: in instantiation of member function 'RunWorkBatch, 1, 1, 2>::run' requested here 5 | DEFINE_ncclDevFunc(Reduce_RING_LL128_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_LL128, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), 
nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 12 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 11 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 12 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 2>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 2>, 2>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 2>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:7:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 7 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_2, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:33:7: note: in instantiation of member function 'Primitives, FanSymmetric<1>, 0, ProtoSimple<1, 1, 4>, 0>::Primitives' requested here 33 | prims(tid, nthreads, &ring->prev, &ring->next, work->sendbuff, work->recvbuff, work->redOpArg, 0, work->connIndex, work->connIndex); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/reduce.h:63:5: note: in instantiation of function template specialization '(anonymous namespace)::runRing, ProtoSimple<1, 1, 4>, 4>' requested here 63 | runRing(tid, nthreads, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:432:78: note: in instantiation of member function 'RunWorkColl, 1, 2, 4>::run' requested here 432 | if (tid < subtn) RunWorkColl().run(tid, subtn, work); | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/reduce_sumpostdiv_u8.cpp:12:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 12 | DEFINE_ncclDevFunc(Reduce_RING_SIMPLE_SumPostDiv_u8_4, ncclFuncReduce, FuncSumPostDiv, uint8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for host. 
[ 94%] Building CXX object CMakeFiles/rccl.dir/git_version.cpp.o /usr/bin/hipcc -DCOMPILE_MSCCL_KERNEL -DENABLE_COLLTRACE -DENABLE_LL128 -DHIP_CONTIGUOUS_MEMORY -DHIP_UNCACHED_MEMORY -DNVTX_DISABLE -DNVTX_NO_IMPL -DROCM_VERSION=60402 -DUSE_PROF_API=1 -DUSE_ROCM_SMI64CONFIG -DUSE_ROCM_SMI_THREAD_ONLY_MUTEX -D__HIP_PLATFORM_AMD__=1 -Drccl_EXPORTS -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/network/unpack -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -fPIC -parallel-jobs=1 -Werror=uninitialized -Werror=sometimes-uninitialized -Wno-format-nonliteral -fgpu-rdc -fvisibility=hidden -mllvm --amdgpu-kernarg-preload-count=16 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT CMakeFiles/rccl.dir/git_version.cpp.o -MF CMakeFiles/rccl.dir/git_version.cpp.o.d -o CMakeFiles/rccl.dir/git_version.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/git_version.cpp In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 12 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 12 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 12 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 12 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 12 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:259:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 259 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:271:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 271 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 8>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:257:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 257 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 8>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:269:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 269 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:259:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 259 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:271:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 271 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 10 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:1: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:12: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/collectives.h:15: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/device.h:14: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:75:7: warning: unused variable 'w' [-Wunused-variable] 75 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:174: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:14: warning: unused variable 'data1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:21: warning: unused variable 'flag1' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:28: warning: unused variable 'data2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll.h:145:35: warning: unused variable 'flag2' [-Wunused-variable] 145 | uint32_t data1, flag1, data2, flag2; | ^~~~~ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:175: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_ll128.h:80:5: warning: unused variable 'w' [-Wunused-variable] 80 | barrier_by_group(); | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:29:15: note: expanded from macro 'barrier_by_group' 29 | const int w = threadIdx.x/WARP_SIZE; \ | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/primitives.h:173: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: 
initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 2>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 2>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:3:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 2>::run' requested here 3 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_2, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 2) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:45:7: note: in instantiation of member function 'Primitives, FanAsymmetric<0, 1>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 45 | prims(tid, tn, nullptr, &work->sendRank, work->sendAddr, nullptr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:261:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runSend>' requested here 261 | runSend>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: 
expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: warning: initializer order does not match the declaration order [-Wreorder-ctor] 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~ | tidInBlock(threadIdx.x nthreads(nthreads stepSize(stepSize_ == 0 ? ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_ 671 | stepSize(stepSize_ == 0 ? 
ncclShmem.comm.buffSizes[NCCL_PROTO_SIMPLE]/NCCL_STEPS/sizeof(T) : stepSize_) { | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | group(group /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:103:7: note: in instantiation of member function 'Primitives, FanAsymmetric<1, 0>, 0, ProtoSimple<1, 1, 4>, 1>::Primitives' requested here 103 | prims(tid, tn, &work->recvRank, nullptr, nullptr, work->recvAddr, | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/sendrecv.h:273:9: note: in instantiation of function template specialization 'RunWorkBatch, 1, 2, 4>::runRecv>' requested here 273 | runRecv>(subtid, subtn, group, work); | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/gensrc/sendrecv_sum_i8.cpp:4:1: note: in instantiation of member function 'RunWorkBatch, 1, 2, 4>::run' requested here 4 | DEFINE_ncclDevFunc(SendRecv_RING_SIMPLE_Sum_i8_4, ncclFuncSendRecv, FuncSum, int8_t, NCCL_ALGO_RING, NCCL_PROTO_SIMPLE, 4) | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/common.h:611:62: note: expanded from macro 'DEFINE_ncclDevFunc' 611 | RunWorkBatch, algo, proto, unroll>().run(); \ | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:15: note: field 'nthreads' will be initialized after field 'tidInBlock' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/device/prims_simple.h:670:60: note: field 'group' will be initialized after field 'stepSize' 670 | tid(tid), nthreads(nthreads), tidInBlock(threadIdx.x), group(group), | ^~~~~~~~~~~ 12 warnings generated when compiling for host. 
[ 94%] Linking CXX shared library librccl.so /usr/bin/cmake -E cmake_link_script CMakeFiles/rccl.dir/link.txt --verbose=1 /usr/bin/cmake -E time /usr/bin/hipcc -fPIC -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -parallel-jobs=1 -Xoffload-linker -mllvm=-amdgpu-kernarg-preload-count=16 -Xlinker --dependency-file=CMakeFiles/rccl.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,librccl.so.1 -o librccl.so.1.0 CMakeFiles/rccl.dir/hipify/src/bootstrap.cc.o CMakeFiles/rccl.dir/hipify/src/channel.cc.o CMakeFiles/rccl.dir/hipify/src/collectives.cc.o CMakeFiles/rccl.dir/hipify/src/debug.cc.o CMakeFiles/rccl.dir/hipify/src/enqueue.cc.o CMakeFiles/rccl.dir/hipify/src/group.cc.o CMakeFiles/rccl.dir/hipify/src/init.cc.o CMakeFiles/rccl.dir/hipify/src/init_nvtx.cc.o CMakeFiles/rccl.dir/hipify/src/net.cc.o CMakeFiles/rccl.dir/hipify/src/msccl.cc.o CMakeFiles/rccl.dir/hipify/src/proxy.cc.o CMakeFiles/rccl.dir/hipify/src/rccl_wrap.cc.o CMakeFiles/rccl.dir/hipify/src/register.cc.o CMakeFiles/rccl.dir/hipify/src/transport.cc.o CMakeFiles/rccl.dir/hipify/src/device/common.cu.cpp.o CMakeFiles/rccl.dir/hipify/src/device/onerank.cu.cpp.o CMakeFiles/rccl.dir/hipify/src/graph/connect.cc.o CMakeFiles/rccl.dir/hipify/src/graph/paths.cc.o CMakeFiles/rccl.dir/hipify/src/graph/rings.cc.o CMakeFiles/rccl.dir/hipify/src/graph/rome_models.cc.o CMakeFiles/rccl.dir/hipify/src/graph/search.cc.o CMakeFiles/rccl.dir/hipify/src/graph/topo.cc.o CMakeFiles/rccl.dir/hipify/src/graph/trees.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/tuning.cc.o CMakeFiles/rccl.dir/hipify/src/graph/xml.cc.o CMakeFiles/rccl.dir/hipify/src/misc/alt_rsmi.cc.o CMakeFiles/rccl.dir/hipify/src/misc/archinfo.cc.o CMakeFiles/rccl.dir/hipify/src/misc/argcheck.cc.o CMakeFiles/rccl.dir/hipify/src/misc/api_trace.cc.o CMakeFiles/rccl.dir/hipify/src/misc/ibvsymbols.cc.o CMakeFiles/rccl.dir/hipify/src/misc/ibvwrap.cc.o CMakeFiles/rccl.dir/hipify/src/misc/ipcsocket.cc.o CMakeFiles/rccl.dir/hipify/src/misc/npkit.cc.o CMakeFiles/rccl.dir/hipify/src/misc/nvmlwrap_stub.cc.o CMakeFiles/rccl.dir/hipify/src/misc/param.cc.o CMakeFiles/rccl.dir/hipify/src/misc/profiler.cc.o CMakeFiles/rccl.dir/hipify/src/misc/rocm_smi_wrap.cc.o CMakeFiles/rccl.dir/hipify/src/misc/rocmwrap.cc.o CMakeFiles/rccl.dir/hipify/src/misc/roctx.cc.o CMakeFiles/rccl.dir/hipify/src/misc/shmutils.cc.o CMakeFiles/rccl.dir/hipify/src/misc/signals.cc.o CMakeFiles/rccl.dir/hipify/src/misc/socket.cc.o CMakeFiles/rccl.dir/hipify/src/misc/strongstream.cc.o CMakeFiles/rccl.dir/hipify/src/misc/tuner.cc.o CMakeFiles/rccl.dir/hipify/src/misc/utils.cc.o CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_lifecycle.cc.o CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_parser.cc.o CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_setup.cc.o CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_status.cc.o CMakeFiles/rccl.dir/hipify/src/transport/coll_net.cc.o CMakeFiles/rccl.dir/hipify/src/transport/generic.cc.o CMakeFiles/rccl.dir/hipify/src/transport/net_tmp.cc.o CMakeFiles/rccl.dir/hipify/src/transport/net_ib.cc.o CMakeFiles/rccl.dir/hipify/src/transport/net_socket.cc.o CMakeFiles/rccl.dir/hipify/src/transport/nvls.cc.o CMakeFiles/rccl.dir/hipify/src/transport/p2p.cc.o CMakeFiles/rccl.dir/hipify/src/transport/shm.cc.o CMakeFiles/rccl.dir/hipify/gensrc/all_gather_sum_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/alltoall_pivot_sum_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/broadcast_sum_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/device_table.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/host_table.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_MinMax_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Prod_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/msccl_kernel_Sum_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f16.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f32.cpp.o
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u32.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u64.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u8.cpp.o CMakeFiles/rccl.dir/hipify/gensrc/sendrecv_sum_i8.cpp.o CMakeFiles/rccl.dir/git_version.cpp.o -fgpu-rdc -ldl /usr/lib64/librocm_smi64.so.1.0 /usr/lib64/libamdhip64.so.6.4.43484 --hip-link --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 clang++: warning: argument unused during compilation: '-Xarch_host -fstack-protector-strong' [-Wunused-command-line-argument] clang++: warning: argument unused during compilation: '-Xarch_host -fcf-protection' [-Wunused-command-line-argument] clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] Elapsed time (seconds): 8869.61 /usr/bin/cmake -E cmake_symlink_library librccl.so.1.0 librccl.so.1 librccl.so Extracting metadata from librccl.so /usr/bin/cmake -P /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/cmake/scripts/extract_metadata.cmake Warning: This tool has been DEPRECATED. Similar functionality is provided by llvm-objdump in the rocm-llvm package. Warning: This tool has been DEPRECATED. Similar functionality is provided by llvm-objdump in the rocm-llvm package. Warning: This tool has been DEPRECATED. Similar functionality is provided by llvm-objdump in the rocm-llvm package.
gmake[2]: Leaving directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' [ 94%] Built target rccl /usr/bin/gmake -f test/CMakeFiles/rccl-UnitTests.dir/build.make test/CMakeFiles/rccl-UnitTests.dir/depend gmake[2]: Entering directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2 /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test/CMakeFiles/rccl-UnitTests.dir/DependInfo.cmake "--color=" gmake[2]: Leaving directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' /usr/bin/gmake -f test/CMakeFiles/rccl-UnitTests.dir/build.make test/CMakeFiles/rccl-UnitTests.dir/build gmake[2]: Entering directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' [ 94%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/AllGatherTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong 
-m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/AllGatherTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/AllGatherTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/AllGatherTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp [ 95%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/AllReduceTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/AllReduceTests.cpp.o -MF 
CMakeFiles/rccl-UnitTests.dir/AllReduceTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/AllReduceTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused 
function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t 
y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, 
head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllGatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. [ 95%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/AllToAllTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 
--offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/AllToAllTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/AllToAllTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/AllToAllTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp 2 warnings generated when compiling for host. [ 95%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/AllToAllVTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/AllToAllVTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/AllToAllVTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/AllToAllVTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, 
head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, 
head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllVTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. 
[ 95%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/BroadcastTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/BroadcastTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/BroadcastTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/BroadcastTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/AllToAllTests.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. 
[ 96%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/GatherTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/GatherTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/GatherTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/GatherTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, 
mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/BroadcastTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. [ 96%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/GroupCallTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/GroupCallTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/GroupCallTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/GroupCallTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GatherTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. [ 96%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/NonBlockingTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/NonBlockingTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/NonBlockingTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/NonBlockingTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file 
included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | 
uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | 
uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t 
y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/NonBlockingTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. 
[ 96%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/ReduceScatterTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/ReduceScatterTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/ReduceScatterTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/ReduceScatterTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/GroupCallTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. 
[ 96%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/ReduceTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/ReduceTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/ReduceTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/ReduceTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | 
uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | 
uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. [ 97%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/ScatterTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG 
-std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/ScatterTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/ScatterTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/ScatterTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ReduceTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. [ 97%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/SendRecvTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/SendRecvTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/SendRecvTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/SendRecvTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when 
compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, 
head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, 
head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/ScatterTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. [ 97%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/StandaloneTests.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD 
-MT test/CMakeFiles/rccl-UnitTests.dir/StandaloneTests.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/StandaloneTests.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/StandaloneTests.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/SendRecvTests.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. 
[ 97%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/main.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/main.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/main.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/main.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static 
hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static 
hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, 
mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:9: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/main.cpp:8: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 2 warnings generated when compiling for host. 
[ 98%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/CallCollectiveForked.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/CallCollectiveForked.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/CallCollectiveForked.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/CallCollectiveForked.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/PtrUnion.hpp:10: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/StandaloneTests.cpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for host. 
[ 98%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/CollectiveArgs.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/CollectiveArgs.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/CollectiveArgs.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/CollectiveArgs.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function 
declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1030. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1100. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1101. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx1102. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1102. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1200. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx1201. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for gfx906. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx908. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx90a. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:2: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 2 warnings generated when compiling for gfx942. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:117:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 117 | write(childPipes[rank][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:123:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 123 | read(childPipes[rank][0], &id, sizeof(ncclUniqueId)); | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:129:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 129 | read(childPipes[0][0], &id, sizeof(ncclUniqueId)); //read from child0 | ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CallCollectiveForked.cpp:131:9: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 131 | write(childPipes[r][1], &id, sizeof(ncclUniqueId)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 warnings generated when compiling for host. 
[ 98%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/EnvVars.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/EnvVars.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/EnvVars.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/EnvVars.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | 
uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.cpp:28:44: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 28 | if (this->options.scalarMode == 1) hipHostFree(this->localScalar.ptr); | ^~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | 
hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2 warnings generated when compiling for host. [ 98%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/PrepDataFuncs.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 
--offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/PrepDataFuncs.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/PrepDataFuncs.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/PrepDataFuncs.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute 
[-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute 
[-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute 
[-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute 
[-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute 
[-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute 
[-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute 
[-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute 
[-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 8 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PrepDataFuncs.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for host. 
[ 99%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/PtrUnion.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/PtrUnion.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/PtrUnion.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/PtrUnion.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | 
uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 8 warnings generated when compiling for gfx942. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 3 warnings generated when compiling for gfx1030. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:8: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:32:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 32 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:36:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 36 | hipGetDeviceProperties(&devProp, deviceId); | ^~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~ /usr/include/hip/hip_runtime_api.h:92:32: note: expanded from macro 
'hipGetDeviceProperties' 92 | #define hipGetDeviceProperties hipGetDevicePropertiesR0600 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:75:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 75 | hipGetDeviceCount(&dev); | ^~~~~~~~~~~~~~~~~ ~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:107:7: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 107 | hipDeviceGetAttribute(&numDeviceCUs, hipDeviceAttributeMultiprocessorCount, deviceIdx); | ^~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:156:11: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 156 | hipGetDeviceCount(&numDev); | ^~~~~~~~~~~~~~~~~ ~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:161:13: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 161 | hipDeviceGetPCIBusId(busIdStr, sizeof(busIdStr), dev); | ^~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3 warnings generated when compiling for gfx1100. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.cpp:7: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 8 warnings generated when compiling for host. 
[ 99%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/TestBed.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/TestBed.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/TestBed.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/TestBed.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp 3 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 3 warnings generated when compiling for gfx1102. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 3 warnings generated when compiling for gfx1030. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 3 warnings generated when compiling for gfx1200. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 3 warnings generated when compiling for gfx1201. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 3 warnings generated when compiling for gfx906. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 3 warnings generated when compiling for gfx908. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings generated when compiling for gfx1101. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 3 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 3 warnings generated when compiling for gfx942. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:102:9: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 102 | hipFree(this->ptr); | ^~~~~~~ ~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.cpp:125:5: warning: ignoring return value of function declared with 'nodiscard' attribute [-Wunused-result] 125 | hipStreamSynchronize(NULL); | ^~~~~~~~~~~~~~~~~~~~ ~~~~ 3 warnings generated when compiling for host. 
[ 99%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/TestBedChild.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/TestBedChild.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/TestBedChild.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/TestBedChild.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings 
generated when compiling for gfx1200. 3 warnings generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ 3 warnings generated when compiling for gfx1100. 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings generated when compiling for gfx1201. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ 3 warnings generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ In file included 
from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings generated when compiling for gfx906. 3 warnings generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, 
&id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ 3 warnings generated when compiling for gfx1200. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings generated when compiling for gfx908. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ 3 warnings generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ 3 warnings 
generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings generated when compiling for gfx90a. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from 
/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 3 warnings generated when compiling for gfx908. /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ 3 warnings generated when compiling for gfx942. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ 3 warnings generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:10: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:777:7: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 777 | scanf("%*c"); | ^~~~~ ~~~~~ 3 warnings 
generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBed.hpp:12: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/EnvVars.hpp:15:23: warning: unused function 'CountGpus' [-Wunused-function] 15 | static hsa_status_t CountGpus(hsa_agent_t agent, void* data); | ^~~~~~~~~ In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.hpp:11: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:160:5: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result] 160 | write(childWriteFd, &id, sizeof(id)); | ^~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/TestBedChild.cpp:519:13: warning: unused variable 'errCodeVal' [-Wunused-variable] 519 | auto& errCodeVal = reinterpret_cast(errCode); | ^~~~~~~~~~ 3 warnings generated when compiling for host. 
[ 99%] Building CXX object test/CMakeFiles/rccl-UnitTests.dir/common/StandaloneUtils.cpp.o cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/hipcc -DENABLE_LL128 -DGTEST_LINKED_AS_SHARED_LIBRARY=1 -DROCM_PATH=\"/usr\" -DROCM_VERSION=60402 -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/./common -I/usr -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/include -I/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -std=c++17 -x hip --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 -MD -MT test/CMakeFiles/rccl-UnitTests.dir/common/StandaloneUtils.cpp.o -MF CMakeFiles/rccl-UnitTests.dir/common/StandaloneUtils.cpp.o.d -o CMakeFiles/rccl-UnitTests.dir/common/StandaloneUtils.cpp.o -c /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp 3 warnings generated when compiling for host. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1030. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1100. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1101. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1102. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1200. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx1201. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx906. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx908. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx90a. 
In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for gfx942. In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/StandaloneUtils.cpp:6: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/CollectiveArgs.hpp:7: In file included from /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/test/common/PtrUnion.hpp:10: /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/hipify/src/include/rccl_float8.h:77:18: warning: unused variable 'y' [-Wunused-variable] 77 | uint32_t y, head, mantissa; | ^ 1 warning generated when compiling for host. 
[100%] Linking CXX executable rccl-UnitTests cd /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/test && /usr/bin/cmake -E cmake_link_script CMakeFiles/rccl-UnitTests.dir/link.txt --verbose=1 clang++: warning: argument unused during compilation: '-Xarch_host -fstack-protector-strong' [-Wunused-command-line-argument] clang++: warning: argument unused during compilation: '-Xarch_host -fcf-protection' [-Wunused-command-line-argument] clang++: warning: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-package-notes' [-Wunused-command-line-argument] /usr/bin/hipcc -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -O2 -g -DNDEBUG -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -rdynamic -Xlinker --dependency-file=CMakeFiles/rccl-UnitTests.dir/link.d "CMakeFiles/rccl-UnitTests.dir/AllGatherTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/AllReduceTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/AllToAllTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/AllToAllVTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/BroadcastTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/GatherTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/GroupCallTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/NonBlockingTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/ReduceScatterTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/ReduceTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/ScatterTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/SendRecvTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/StandaloneTests.cpp.o" "CMakeFiles/rccl-UnitTests.dir/common/main.cpp.o" 
"CMakeFiles/rccl-UnitTests.dir/common/CallCollectiveForked.cpp.o" "CMakeFiles/rccl-UnitTests.dir/common/CollectiveArgs.cpp.o" "CMakeFiles/rccl-UnitTests.dir/common/EnvVars.cpp.o" "CMakeFiles/rccl-UnitTests.dir/common/PrepDataFuncs.cpp.o" "CMakeFiles/rccl-UnitTests.dir/common/PtrUnion.cpp.o" "CMakeFiles/rccl-UnitTests.dir/common/TestBed.cpp.o" "CMakeFiles/rccl-UnitTests.dir/common/TestBedChild.cpp.o" "CMakeFiles/rccl-UnitTests.dir/common/StandaloneUtils.cpp.o" -o rccl-UnitTests /usr/lib64/libgtest_main.so.1.15.2 /usr/lib64/libhsa-runtime64.so.1.15.0 ../librccl.so.1.0 /usr/lib64/libgtest.so.1.15.2 --hip-link --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 --offload-arch=gfx1200 --offload-arch=gfx1201 /usr/lib64/libamdhip64.so.6.4.43484 gmake[2]: Leaving directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' [100%] Built target rccl-UnitTests gmake[1]: Leaving directory '/builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build' /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/redhat-linux-build/CMakeFiles 0 + RPM_EC=0 ++ jobs -p + exit 0 Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.joJ5OL + umask 022 + cd /builddir/build/BUILD/rccl-6.4.2-build + '[' /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT '!=' / ']' + rm -rf /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT ++ dirname /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT + mkdir -p /builddir/build/BUILD/rccl-6.4.2-build + mkdir /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT + CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables 
-fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -Xarch_host -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -Xarch_host -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' + export CXXFLAGS + FFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config /usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -Wl,-z,now -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=hipcc + export CC + CXX=hipcc + export CXX + cd rccl-rocm-6.4.2 + 
DESTDIR=/builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT + /usr/bin/cmake --install redhat-linux-build -- Install configuration: "RelWithDebInfo" -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/lib64/librccl.so.1.0 -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/lib64/librccl.so.1 -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/lib64/librccl.so -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/include/rccl/rccl.h -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/include/rccl/nccl_net.h -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/include/rccl/amd_detail/api_trace.h -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/allreduce-allpairs-8n-ll-32tb-op.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/allreduce-allpairs-8n-ll-32tb.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/allreduce-allpairs-8n-ll-64tb-op.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/allreduce-allpairs-8n-ll-64tb.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/allreduce-allpairs-8n-simple-op.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/allreduce-allpairs-8n-simple.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/allreduce-allpairs-8n-simple_2.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/alltoall-8n-0-9kb.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/alltoall-8n-190kb-512kb.xml -- Installing: 
/builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/alltoall-8n-512kb-7mb.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/alltoall-8n-7mb-43mb.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-algorithms/alltoall-8n-9kb-190kb.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-unit-test-algorithms -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-unit-test-algorithms/all-reduce-ring-ll.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-unit-test-algorithms/all-reduce-ring-ll128.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/rccl/msccl-unit-test-algorithms/all-reduce-ring-simple.xml -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/lib64/cmake/rccl/rccl-targets.cmake -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/lib64/cmake/rccl/rccl-targets-relwithdebinfo.cmake -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/lib64/cmake/rccl/rccl-config.cmake -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/lib64/cmake/rccl/rccl-config-version.cmake -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/doc/rccl/LICENSE.txt -- Installing: /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/bin/rccl-UnitTests + '[' -f /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/doc/rccl/LICENSE.txt ']' + rm /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/doc/rccl/LICENSE.txt + /usr/bin/find-debuginfo -j2 --strict-build-id -m -i --build-id-seed 6.4.2-5.fc43 --unique-debug-suffix -6.4.2-5.fc43.x86_64 --unique-debug-src-base rccl-6.4.2-5.fc43.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2 find-debuginfo: starting 
Extracting debug info from 2 files DWARF-compressing 2 files dwz: ./usr/bin/rccl-UnitTests-6.4.2-5.fc43.x86_64.debug: Unknown debugging section .debug_str_offsets dwz: ./usr/lib64/librccl.so.1.0-6.4.2-5.fc43.x86_64.debug: Unknown debugging section .debug_str_offsets dwz: Too few files for multifile optimization sepdebugcrcfix: Updated 0 CRC32s, 2 CRC32s did match. Creating .debug symlinks for symlinks to ELF files Copying sources found by 'debugedit -l' to /usr/src/debug/rccl-6.4.2-5.fc43.x86_64 find-debuginfo: done + /usr/lib/rpm/check-buildroot + /usr/lib/rpm/redhat/brp-ldconfig + /usr/lib/rpm/brp-compress + /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip + /usr/lib/rpm/redhat/brp-mangle-shebangs + /usr/lib/rpm/brp-remove-la-files + /usr/lib/rpm/redhat/brp-python-rpm-in-distinfo + env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j2 + /usr/lib/rpm/redhat/brp-python-hardlink + /usr/bin/add-determinism --brp -j2 /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT Scanned 43 directories and 348 files, processed 0 inodes, 0 modified (0 replaced + 0 rewritten), 0 unsupported format, 0 errors Reading /builddir/build/BUILD/rccl-6.4.2-build/SPECPARTS/rpm-debuginfo.specpart Processing files: rccl-6.4.2-5.fc43.x86_64 Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.RUXci4 + umask 022 + cd /builddir/build/BUILD/rccl-6.4.2-build + cd rccl-rocm-6.4.2 + LICENSEDIR=/builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/licenses/rccl + export LC_ALL=C.UTF-8 + LC_ALL=C.UTF-8 + export LICENSEDIR + /usr/bin/mkdir -p /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/licenses/rccl + cp -pr /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/LICENSE.txt /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/licenses/rccl + RPM_EC=0 ++ jobs -p + exit 0 Provides: librccl.so.1()(64bit) rccl = 6.4.2-5.fc43 rccl(x86-64) = 6.4.2-5.fc43 Requires(interp): /sbin/ldconfig /sbin/ldconfig Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 
4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires(post): /sbin/ldconfig Requires(postun): /sbin/ldconfig Requires: ld-linux-x86-64.so.2()(64bit) ld-linux-x86-64.so.2(GLIBC_2.3)(64bit) libamdhip64.so.6()(64bit) libamdhip64.so.6(hip_4.2)(64bit) libamdhip64.so.6(hip_4.3)(64bit) libamdhip64.so.6(hip_4.5)(64bit) libamdhip64.so.6(hip_5.0)(64bit) libamdhip64.so.6(hip_5.3)(64bit) libamdhip64.so.6(hip_6.0)(64bit) libc.so.6()(64bit) libc.so.6(GLIBC_2.10)(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.16)(64bit) libc.so.6(GLIBC_2.17)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_2.3)(64bit) libc.so.6(GLIBC_2.3.2)(64bit) libc.so.6(GLIBC_2.3.4)(64bit) libc.so.6(GLIBC_2.32)(64bit) libc.so.6(GLIBC_2.33)(64bit) libc.so.6(GLIBC_2.34)(64bit) libc.so.6(GLIBC_2.38)(64bit) libc.so.6(GLIBC_2.4)(64bit) libc.so.6(GLIBC_2.42)(64bit) libc.so.6(GLIBC_2.6)(64bit) libc.so.6(GLIBC_2.7)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_12.0.0)(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.2.5)(64bit) librocm_smi64.so.1()(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.7)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.18)(64bit) libstdc++.so.6(GLIBCXX_3.4.19)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.22)(64bit) libstdc++.so.6(GLIBCXX_3.4.26)(64bit) libstdc++.so.6(GLIBCXX_3.4.29)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) libstdc++.so.6(GLIBCXX_3.4.9)(64bit) Processing files: rccl-devel-6.4.2-5.fc43.x86_64 Executing(%doc): /bin/sh -e /var/tmp/rpm-tmp.W2PgVU + umask 022 + cd /builddir/build/BUILD/rccl-6.4.2-build + cd rccl-rocm-6.4.2 + DOCDIR=/builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/doc/rccl-devel + export LC_ALL=C.UTF-8 + LC_ALL=C.UTF-8 + export DOCDIR + /usr/bin/mkdir -p 
/builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/doc/rccl-devel + cp -pr /builddir/build/BUILD/rccl-6.4.2-build/rccl-rocm-6.4.2/README.md /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT/usr/share/doc/rccl-devel + RPM_EC=0 ++ jobs -p + exit 0 Provides: cmake(rccl) = 2.22.3 rccl-devel = 6.4.2-5.fc43 rccl-devel(x86-64) = 6.4.2-5.fc43 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires: cmake-filesystem(x86-64) librccl.so.1()(64bit) Processing files: rccl-data-6.4.2-5.fc43.noarch Provides: rccl-data = 6.4.2-5.fc43 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Processing files: rccl-test-6.4.2-5.fc43.x86_64 Provides: rccl-test = 6.4.2-5.fc43 rccl-test(x86-64) = 6.4.2-5.fc43 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires: libamdhip64.so.6()(64bit) libamdhip64.so.6(hip_4.2)(64bit) libamdhip64.so.6(hip_4.3)(64bit) libamdhip64.so.6(hip_6.0)(64bit) libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_2.3.4)(64bit) libc.so.6(GLIBC_2.32)(64bit) libc.so.6(GLIBC_2.34)(64bit) libc.so.6(GLIBC_2.38)(64bit) libc.so.6(GLIBC_2.4)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_12.0.0)(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libgtest.so.1.15.2()(64bit) librccl.so.1()(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.14)(64bit) libstdc++.so.6(GLIBCXX_3.4.15)(64bit) libstdc++.so.6(GLIBCXX_3.4.18)(64bit) libstdc++.so.6(GLIBCXX_3.4.19)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.26)(64bit) libstdc++.so.6(GLIBCXX_3.4.29)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) 
libstdc++.so.6(GLIBCXX_3.4.32)(64bit) libstdc++.so.6(GLIBCXX_3.4.9)(64bit) Processing files: rccl-debugsource-6.4.2-5.fc43.x86_64 Provides: rccl-debugsource = 6.4.2-5.fc43 rccl-debugsource(x86-64) = 6.4.2-5.fc43 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Processing files: rccl-debuginfo-6.4.2-5.fc43.x86_64 Provides: debuginfo(build-id) = 50fe5af295dad5630d132903bb6c295e5e5f594d librccl.so.1.0-6.4.2-5.fc43.x86_64.debug()(64bit) rccl-debuginfo = 6.4.2-5.fc43 rccl-debuginfo(x86-64) = 6.4.2-5.fc43 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Recommends: rccl-debugsource(x86-64) = 6.4.2-5.fc43 Processing files: rccl-test-debuginfo-6.4.2-5.fc43.x86_64 Provides: debuginfo(build-id) = fff2afb310c3eea40c7bace111a3f530279876bb rccl-test-debuginfo = 6.4.2-5.fc43 rccl-test-debuginfo(x86-64) = 6.4.2-5.fc43 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Recommends: rccl-debugsource(x86-64) = 6.4.2-5.fc43 Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILD/rccl-6.4.2-build/BUILDROOT Wrote: /builddir/build/RPMS/rccl-data-6.4.2-5.fc43.noarch.rpm Wrote: /builddir/build/RPMS/rccl-debuginfo-6.4.2-5.fc43.x86_64.rpm Wrote: /builddir/build/RPMS/rccl-test-debuginfo-6.4.2-5.fc43.x86_64.rpm Wrote: /builddir/build/RPMS/rccl-debugsource-6.4.2-5.fc43.x86_64.rpm Wrote: /builddir/build/RPMS/rccl-test-6.4.2-5.fc43.x86_64.rpm Wrote: /builddir/build/RPMS/rccl-devel-6.4.2-5.fc43.x86_64.rpm Wrote: /builddir/build/RPMS/rccl-6.4.2-5.fc43.x86_64.rpm Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.OApxaU + umask 022 + cd /builddir/build/BUILD/rccl-6.4.2-build + test -d /builddir/build/BUILD/rccl-6.4.2-build + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/rccl-6.4.2-build + rm -rf /builddir/build/BUILD/rccl-6.4.2-build + 
RPM_EC=0 ++ jobs -p + exit 0 Finish: rpmbuild rccl-6.4.2-5.fc43.src.rpm Finish: build phase for rccl-6.4.2-5.fc43.src.rpm INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: /var/lib/mock/fedora-43-x86_64-1759943729.176052/root/var/log/dnf5.log INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz /bin/tar: Removing leading `/' from member names INFO: Done(/var/lib/copr-rpmbuild/results/rccl-6.4.2-5.fc43.src.rpm) Config(child) 244 minutes 55 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. Finish: clean chroot Finish: run Running RPMResults tool Package info: { "packages": [ { "name": "rccl-debugsource", "epoch": null, "version": "6.4.2", "release": "5.fc43", "arch": "x86_64" }, { "name": "rccl-debuginfo", "epoch": null, "version": "6.4.2", "release": "5.fc43", "arch": "x86_64" }, { "name": "rccl-test", "epoch": null, "version": "6.4.2", "release": "5.fc43", "arch": "x86_64" }, { "name": "rccl-devel", "epoch": null, "version": "6.4.2", "release": "5.fc43", "arch": "x86_64" }, { "name": "rccl-test-debuginfo", "epoch": null, "version": "6.4.2", "release": "5.fc43", "arch": "x86_64" }, { "name": "rccl", "epoch": null, "version": "6.4.2", "release": "5.fc43", "arch": "src" }, { "name": "rccl-data", "epoch": null, "version": "6.4.2", "release": "5.fc43", "arch": "noarch" }, { "name": "rccl", "epoch": null, "version": "6.4.2", "release": "5.fc43", "arch": "x86_64" } ] } RPMResults finished